[NSDI25] AutoCCL: Automated Collective Communication Tuning for Accelerating Distributed and Parallel DNN Training
☆31May 2, 2025Updated last year
Alternatives and similar repositories for autoccl
Users that are interested in autoccl are comparing it to the libraries listed below.
- ☆24Feb 12, 2025Updated last year
- Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning"☆17Aug 4, 2020Updated 5 years ago
- ☆24May 9, 2025Updated last year
- [TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees…☆20Apr 27, 2026Updated 3 weeks ago
- λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)☆14Apr 2, 2025Updated last year
- ☆81Sep 15, 2025Updated 8 months ago
- Sample Codes using NVSHMEM on Multi-GPU☆30Jan 22, 2023Updated 3 years ago
- COCCL: Compression and precision co-aware collective communication library☆31Mar 16, 2025Updated last year
- A minimum demo for PyTorch distributed extension functionality for collectives.☆15Jul 29, 2024Updated last year
- MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters☆21Apr 21, 2023Updated 3 years ago
- ☆19Oct 2, 2023Updated 2 years ago
- Code for reproducing experiments performed for Accordion☆13Jun 11, 2021Updated 4 years ago
- Here is the repo for public scripts.☆12Jul 16, 2022Updated 3 years ago
- [ICML 2026] Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning☆33Sep 12, 2025Updated 8 months ago
- ☆13Aug 6, 2022Updated 3 years ago
- ☆10Sep 3, 2017Updated 8 years ago
- A reproduction of the libsmctrl paper, with an added Python-side interface that can be called flexibly from Python to allocate compute resources☆12May 21, 2024Updated last year
- SocksDirect code repository☆20May 6, 2026Updated last week
- High Performance KV Cache Store for LLM☆53Apr 6, 2026Updated last month
- Flexible, high-performance TCP offload to SmartNICs using fine-grained parallelism☆61Feb 27, 2022Updated 4 years ago
- Accelerated in CUDA☆11Oct 28, 2022Updated 3 years ago
- An interface to program any congestion control protocol for an unreliable connection based protocol sent over UDP. It comes with a clean …☆12Apr 8, 2022Updated 4 years ago
- AI Cluster Observability & Troubleshooting Toolkit. Powered by SII & Infrawaves.☆36Apr 29, 2026Updated 3 weeks ago
- ☆14Sep 29, 2017Updated 8 years ago
- Tile-based language built for AI computation across all scales☆149Updated this week
- Cheetah is a system that optimizes queries using programmable switches.☆20Jun 25, 2020Updated 5 years ago
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale☆13May 17, 2025Updated last year
- ☆29May 24, 2024Updated last year
- This repository contains a SystemVerilog implementation of a parametrized Round Robin arbiter with three instantiation options☆13Jan 28, 2024Updated 2 years ago
- Accepted to MLSys 2026☆81Apr 19, 2026Updated last month
- Layer-wise Sparsification of Distributed Deep Learning☆10Jul 6, 2020Updated 5 years ago
- Convolutional 3D autoencoder☆14Aug 21, 2016Updated 9 years ago
- [NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training☆40Sep 10, 2024Updated last year
- Some fun pages, deployed with GitHub Pages and Vercel☆13Feb 8, 2024Updated 2 years ago
- template for https://cnli.me☆10Feb 27, 2025Updated last year
- Expressive, Easy to Build, and High-Performance Application Networks☆19Jul 1, 2025Updated 10 months ago
- Dark channel haze removal algorithm with CUDA acceleration (typically 10x or more speedup using an NVIDIA GPU)☆14Dec 7, 2017Updated 8 years ago
- This repository contains the Wireshark dissector generator from a P4 file input.☆15Mar 24, 2017Updated 9 years ago
- Language modeling on the Penn Treebank (PTB) corpus using a trigram model with linear interpolation, a neural probabilistic language mode…☆18Oct 8, 2018Updated 7 years ago