SophiaLi06 / BytePS_THCLinks

THC: Accelerating Distributed Deep Learning Using Tensor Homomorphic Compression

☆19

Alternatives and similar repositories for BytePS_THC

Users that are interested in BytePS_THC are comparing it to the libraries listed below

Sorting:

zhuangwang93 / Espresso
Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…
☆15Updated 2 years ago
phoenix-dataplane / mCCS
Managed collective communication service
☆22Updated last year
alibaba / alibaba-lingjun-dataset-2023
☆60Updated last year
Per-Packet-AI / Caravan-Artifact-OSDI24
☆18Updated last year
microsoft / TE-CCL
☆41Updated last year
netiken / m4
[TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Om Chabra, Arash Nasr-Esfahany, Kevin Zhao, Pratees…
☆15Updated this week
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆35Updated 2 years ago
Thesys-lab / Helix-ASPLOS25
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆68Updated last week
netiken / m3
[ACM SIGCOMM 2024] "m3: Accurate Flow-Level Performance Estimation using Machine Learning" by Chenning Li, Arash Nasr-Esfahany, Kevin Zha…
☆24Updated last year
msr-fiddle / blox
☆43Updated last year
hipersys-team / TopoOpt
[NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training
☆34Updated last year
in-ATP / ATP
☆83Updated 3 years ago
ParCIS / Ok-Topk
Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…
☆27Updated 2 years ago
NASP-THU / multiverse
GPU-accelerated LLM Training Simulator
☆38Updated 4 months ago
JF-D / Proteus
☆23Updated last year
microsoft / taccl
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
☆76Updated 2 years ago
sands-lab / omnireduce
☆68Updated 2 years ago
HKUST-SING / herald
Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)
☆23Updated last year
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆88Updated 2 years ago
appnet-org / appnet
Expressive, Easy to Build, and High-Performance Application Networks
☆18Updated 3 months ago
msr-fiddle / synergy
☆51Updated 2 years ago
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆13Updated last year
axio-project / FuseLink
Efficient GPU communication over multiple NICs.
☆21Updated 3 months ago
netx-repo / training-bottleneck
Analyze network performance in distributed training
☆19Updated 5 years ago
snowzjx / liteflow
A Hybrid Framework to Build High-performance Adaptive Neural Networks for Kernel Datapath
☆27Updated 2 years ago
in-ATP / switchML
☆32Updated 4 years ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆116Updated last year
SymbioticLab / Aequitas
Aequitas enables RPC-level QoS in datacenter networks.
☆18Updated 3 years ago
Rivendile / Muri
Artifacts for our SIGCOMM'22 paper Muri
☆43Updated last year
gudiandian / ElasticFlow
☆16Updated last year