AMD-AGI / Primus-Turbo
☆22 Updated last week
Alternatives and similar repositories for Primus-Turbo
Users who are interested in Primus-Turbo are comparing it to the libraries listed below.
- ☆121 Updated 9 months ago
- A lightweight design for computation-communication overlap. ☆177 Updated 2 weeks ago
- ☆238 Updated last year
- ☆83 Updated 2 years ago
- ☆85 Updated 6 months ago
- Ultra and Unified CCL ☆567 Updated this week
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆418 Updated this week
- Perplexity GPU Kernels ☆476 Updated 2 weeks ago
- ☆123 Updated 10 months ago
- High performance Transformer implementation in C++. ☆134 Updated 8 months ago
- Thunder Research Group's Collective Communication Library ☆42 Updated 2 months ago
- Microsoft Collective Communication Library ☆66 Updated 10 months ago
- ☆144 Updated 4 months ago
- nnScaler: Compiling DNN models for Parallel Training ☆118 Updated last week
- Microsoft Collective Communication Library ☆360 Updated 2 years ago
- NCCL Profiling Kit ☆145 Updated last year
- Dynamic Memory Management for Serving LLMs without PagedAttention ☆421 Updated 4 months ago
- DeepSeek-V3/R1 inference performance simulator ☆170 Updated 6 months ago
- A low-latency & high-throughput serving engine for LLMs ☆422 Updated 4 months ago
- An interference-aware scheduler for fine-grained GPU sharing ☆147 Updated 8 months ago
- ☆57 Updated 4 months ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer ☆130 Updated 2 weeks ago
- ☆90 Updated 10 months ago
- A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache. ☆60 Updated last month
- Distributed MoE in a Single Kernel [NeurIPS '25] ☆49 Updated this week
- Examples of CUDA implementations by Cutlass CuTe ☆236 Updated 3 months ago
- ☆106 Updated 4 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models. ☆66 Updated 6 months ago
- ☆108 Updated last year
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ☆221 Updated 2 years ago