gbxu/autoccl

[NSDI25] AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training

☆32

Alternatives and similar repositories for autoccl

Users that are interested in autoccl are comparing it to the libraries listed below.

stepfun-ai / InfiniteHBD-Trace
☆17 · May 21, 2025 · Updated last year
H-Huang / torch_collective_extension
A minimum demo for PyTorch distributed extension functionality for collectives.
☆15 · Jul 29, 2024 · Updated last year
spcl / muliticast-based-allgather
☆24 · Feb 12, 2025 · Updated last year
netiken / m4
[TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees…
☆21 · Jun 19, 2026 · Updated last month
netiken / m3
[ACM SIGCOMM 2024] "m3: Accurate Flow-Level Performance Estimation using Machine Learning" by Chenning Li, Arash Nasr-Esfahany, Kevin Zha…
☆25 · Oct 2, 2024 · Updated last year
sii-research / VCCL
Venus Collective Communication Library, supported by SII and Infrawaves.
☆151 · Jun 24, 2026 · Updated 3 weeks ago
hpdps-group / COCCL
COCCL: Compression and precision co-aware collective communication library
☆36 · Jul 7, 2026 · Updated 2 weeks ago
epfml / powergossip
Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning"
☆17 · Aug 4, 2020 · Updated 5 years ago
spcl / crosspipe
Official implementation of CrossPipe: Towards Optimal Pipeline Schedules for Cross-Datacenter Training (ATC '25), built on top of Megatro…
☆17 · Jul 6, 2025 · Updated last year
astra-sim / tacos
TACOS: [T]opology-[A]ware [Co]llective Algorithm [S]ynthesizer for Distributed Machine Learning
☆37 · Jun 13, 2025 · Updated last year
muriloboratto / NVSHEMEM
Sample Codes using NVSHMEM on Multi-GPU
☆30 · Jan 22, 2023 · Updated 3 years ago
jonlanglet / DTA
This is the repository for Direct Telemetry Access, a high-speed network telemetry collection system.
☆27 · Apr 6, 2025 · Updated last year
aliyun / syccl
☆24 · Sep 10, 2025 · Updated 10 months ago
eth-easl / sailor
AI model training on heterogeneous, geo-distributed resources
☆46 · Nov 24, 2025 · Updated 7 months ago
opencomputeproject / OCP-Multipath-Reliable-Connection
Multipath Reliable Connection (MRC) extends InfiniBand Reliable Connection semantics so a single RDMA connection can spray traffic across…
☆21 · Jun 8, 2026 · Updated last month
astra-sim / stage
STAGE: A Symbolic Tensor grAph GEnerator for distributed AI system co-design
☆48 · Jun 27, 2026 · Updated 3 weeks ago
uccl-project / uccl
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g…
☆1,467 · Updated this week
aliyun / aicb
☆237 · Jul 2, 2026 · Updated 2 weeks ago
NASP-THU / multiverse
GPU-accelerated LLM Training Simulator
☆52 · Jun 26, 2025 · Updated last year
echo17666 / BUAA2022-SysY-Compiler
A SysY Compiler written in Java for the Compiler Technology Course at BUAA
☆20 · Sep 18, 2023 · Updated 2 years ago
hyxcl / nsys_recipes
Custom recipes for NVIDIA Nsight Systems post-collection analysis.
☆16 · Nov 7, 2025 · Updated 8 months ago
ap0stader / ASysyCompilerJudge
A Sysy Compiler Judge. By @ap0stader & @swkfk
☆22 · Jan 12, 2025 · Updated last year
phoenix-dataplane / mCCS
Managed collective communication service
☆24 · Sep 2, 2024 · Updated last year
Sachin-A / TraceWeaver
TraceWeaver is a research prototype for transparently tracing requests through a microservice without application instrumentation.
☆23 · Sep 2, 2024 · Updated last year
VITA-Group / READ-ME
[NeurIPS2024] "Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design", Ruisi Cai, Yeonju Ro, Geon-Woo …
☆16 · Dec 16, 2024 · Updated last year
hwang595 / ATOMO
Atomo: Communication-efficient Learning via Atomic Sparsification
☆29 · Dec 9, 2018 · Updated 7 years ago
axio-project / FuseLink
Efficient GPU communication over multiple NICs.
☆29 · Nov 20, 2025 · Updated 8 months ago
ugonfor / DGQ
[ICLR 2025] DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
☆19 · Mar 25, 2025 · Updated last year
denght23 / CAVER
NS3 simulator for RDMA load balancing
☆12 · Jan 31, 2025 · Updated last year
CarpenterLee / rdma_examples
☆14 · Oct 23, 2023 · Updated 2 years ago
nex-agi / NexVenusCL
Nex Venus Communication Library
☆75 · Nov 17, 2025 · Updated 8 months ago
hjlogzw / DPDK-TCP-UDP_Protocol_Stack
Simple protocol stack based on DPDK (building a protocol stack with DPDK)
☆33 · Apr 25, 2024 · Updated 2 years ago
YJHMITWEB / ExFlow
Explore Inter-layer Expert Affinity in MoE Model Inference
☆16 · May 6, 2024 · Updated 2 years ago
alibaba / alibaba-lingjun-dataset-2023
☆67 · Jun 25, 2024 · Updated 2 years ago
uccl-project / rdmatop
htop-like TUI for real-time RDMA network monitoring.
☆76 · Jul 12, 2026 · Updated last week
firesim / icenet
Network components (NIC, Switch) for FireBox
☆19 · Oct 27, 2024 · Updated last year
perplexityai / pplx-kernels
Perplexity GPU Kernels
☆591 · Nov 7, 2025 · Updated 8 months ago
Thesys-lab / Helix-ASPLOS25
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆93 · Oct 15, 2025 · Updated 9 months ago
eth-easl / deltazip
Compression for Foundation Models
☆36 · Jul 21, 2025 · Updated 11 months ago