aliyun / aicb
☆192 Updated 2 months ago
Alternatives and similar repositories for aicb:
Users who are interested in aicb are comparing it to the libraries listed below.
- ☆40 Updated 4 months ago
- ☆50 Updated last month
- ☆435 Updated 2 weeks ago
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆71 Updated last year
- ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale ☆327 Updated 3 weeks ago
- Repository for MLCommons Chakra schema and tools ☆92 Updated this week
- NS3 simulator for RDMA load balancing ☆57 Updated 5 months ago
- ☆278 Updated last year
- NCCL Profiling Kit ☆127 Updated 8 months ago
- ☆48 Updated 8 months ago
- This is an RDMA program written in Python, based on the Pyverbs provided by the rdma-core (https://github.com/linux-rdma/rdma-core) reposi… ☆29 Updated 3 years ago
- ☆186 Updated 5 years ago
- example code for using DC QP for providing RDMA READ and WRITE operations to remote GPU memory ☆120 Updated 7 months ago
- ☆62 Updated 3 years ago
- [NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training ☆28 Updated 6 months ago
- ☆50 Updated last year
- Curated collection of papers in machine learning systems ☆262 Updated 2 weeks ago
- Synthesizer for optimal collective communication algorithms ☆106 Updated 11 months ago
- ☆159 Updated last year
- NS3 simulator for RDMA over Converged Ethernet v2 (RoCEv2), including the implementation of DCQCN, TIMELY, PFC, ECN and shared buffer swi… ☆287 Updated 6 years ago
- Repository for MLCommons Chakra schema and tools ☆39 Updated last year
- A fast and user-transparent parallel simulator implementation for ns-3 ☆72 Updated 5 months ago
- ☆26 Updated 9 months ago
- ☆13 Updated 9 months ago
- LLM serving cluster simulator ☆93 Updated 10 months ago
- RDMA and SHARP plugins for nccl library ☆183 Updated last month
- Artifacts for our NSDI'23 paper TGS ☆74 Updated 9 months ago
- An interference-aware scheduler for fine-grained GPU sharing ☆127 Updated last month
- NS3 implementation of Homa Transport Protocol ☆23 Updated 10 months ago