facebookresearch / nccl
Optimized primitives for collective multi-GPU communication
☆21Updated 9 months ago
Alternatives and similar repositories for nccl:
Users who are interested in nccl are comparing it to the libraries listed below
- PyTorch process group third-party plugin for UCC☆20Updated 9 months ago
- Magnum IO community repo☆83Updated last week
- NCCL Profiling Kit☆127Updated 6 months ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.☆114Updated last year
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…☆128Updated last week
- RDMA and SHARP plugins for the NCCL library☆172Updated last week
- GPUDirect Async support for IB Verbs☆95Updated 2 years ago
- A command line utility to manage the configuration of a system's high performance network interfaces for RoCE deployments☆28Updated last year
- PyTorch UCC plugin☆18Updated 3 years ago
- ☆36Updated last month
- RCCL Performance Benchmark Tests☆55Updated 2 weeks ago
- Synthesizer for optimal collective communication algorithms☆102Updated 9 months ago
- A plugin that lets EC2 developers use libfabric as the network provider while running NCCL applications.☆160Updated this week
- ☆322Updated 9 months ago
- CloudAI Benchmark Framework☆48Updated this week
- ☆23Updated 3 years ago
- A hierarchical collective communications library with portable optimizations☆26Updated last month
- Mellanox libibverbs☆60Updated 5 years ago
- ☆47Updated 3 months ago
- Unified Collective Communication Library☆223Updated this week
- Microsoft Collective Communication Library☆61Updated 2 months ago
- Microsoft Collective Communication Library☆331Updated last year
- ☆41Updated 8 months ago
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches☆68Updated last year
- NVIDIA DPU OPs collection☆13Updated last year
- A GPU-driven system framework for scalable AI applications☆111Updated last week
- GVProf: A Value Profiler for GPU-based Clusters☆48Updated 10 months ago
- ROCm Communication Collectives Library (RCCL)☆291Updated this week
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs☆82Updated 9 months ago
- Demystifying Datapath Accelerator Enhanced Off-path SmartNIC [ICNP24]☆27Updated last month