[NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training
☆41 · Sep 10, 2024 · Updated last year
Alternatives and similar repositories for TopoOpt
Users that are interested in TopoOpt are comparing it to the libraries listed below.
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆81 · Jul 25, 2023 · Updated 2 years ago
- ☆66 · Jun 25, 2024 · Updated last year
- An evaluation framework for data center traffic engineering. ☆14 · Jul 28, 2024 · Updated last year
- Distributed SDDMM Kernel ☆12 · Jul 8, 2022 · Updated 3 years ago
- ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale ☆606 · Apr 25, 2026 · Updated last month
- GPU-accelerated LLM Training Simulator ☆52 · Jun 26, 2025 · Updated 11 months ago
- Aequitas enables RPC-level QoS in datacenter networks. ☆18 · Jul 19, 2022 · Updated 3 years ago
- Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances. ☆55 · Dec 11, 2022 · Updated 3 years ago
- DNN partition edge-cloud co-infer ☆11 · Jun 11, 2023 · Updated 3 years ago
- Codebase for Teal (SIGCOMM 2023) ☆62 · Apr 19, 2024 · Updated 2 years ago
- Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021 ☆28 · Dec 15, 2021 · Updated 4 years ago
- ☆159 · Jun 15, 2025 · Updated last year
- Synthesized data from real-world traces of data-intensive applications for coflow benchmarking. ☆27 · Aug 5, 2015 · Updated 10 years ago
- ☆234 · Updated this week
- ☆32 · Apr 11, 2022 · Updated 4 years ago
- Predicting the temperature of your system based on factors such as RAM usage,CPU storage temperature,Memory Used and space consumed by th… ☆11 · Apr 23, 2019 · Updated 7 years ago
- GPU-accelerated LLM Training Simulator ☆21 · Jun 26, 2025 · Updated 11 months ago
- A fast and user-transparent parallel simulator implementation for ns-3 ☆108 · Nov 4, 2025 · Updated 7 months ago
- Proof of concept, using Sysdig metrics as the decision variable for a Kubernetes scheduler ☆14 · Nov 3, 2017 · Updated 8 years ago
- Sage of Congestion Control (or How Computers Can Learn from Existing Schemes and Master Internet CC) ☆51 · Nov 17, 2023 · Updated 2 years ago
- minimal program to monitor and print cpu usage of a container ☆12 · May 19, 2020 · Updated 6 years ago
- ☆39 · Nov 28, 2024 · Updated last year
- ☆85 · Dec 2, 2022 · Updated 3 years ago
- ☆25 · May 26, 2021 · Updated 5 years ago
- ☆13 · Dec 18, 2020 · Updated 5 years ago
- ☆36 · Jan 4, 2023 · Updated 3 years ago
- SimEON: Simulator for Elastic Optical Networks ☆11 · Mar 2, 2018 · Updated 8 years ago
- ☆13 · Oct 3, 2024 · Updated last year
- Vivisecting Mobility Management in 5G Cellular Networks (SIGCOMM'22) ☆13 · Jun 26, 2022 · Updated 3 years ago
- ☆12 · Mar 27, 2024 · Updated 2 years ago
- ☆251 · Jul 25, 2024 · Updated last year
- Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23] ☆46 · Nov 24, 2022 · Updated 3 years ago
- Community portal ☆18 · Jun 7, 2026 · Updated last week
- Burstable Cloud Scheduler ☆17 · Jun 6, 2024 · Updated 2 years ago
- An arsenal of Python development resources. The collection includes reference code, experiments, training materials, and more. ☆12 · May 13, 2026 · Updated last month
- Artifacts for our SIGCOMM'23 paper Ditto ☆15 · Oct 17, 2023 · Updated 2 years ago
- [NSDI'22] Differential Network Analysis ☆14 · Jun 2, 2022 · Updated 4 years ago
- ☆11 · Oct 7, 2023 · Updated 2 years ago
- A collection of tools, code, and documentation to understand the host network on real server hardware. ☆46 · Dec 1, 2024 · Updated last year