lzhangbv / acpsgd

[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning

☆10

Alternatives and similar repositories for acpsgd

Users who are interested in acpsgd are comparing it to the repositories listed below.


LGrCo / L-GreCo
AN EFFICIENT AND GENERAL FRAMEWORK FOR LAYERWISE-ADAPTIVE GRADIENT COMPRESSION
☆14 Updated 2 years ago
yxli2123 / LoSparse
☆61 Updated 2 years ago
DS3Lab / AC-SGD
Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**.
☆27 Updated 2 years ago
Ying1123 / llm-caching-multiplexing
☆20 Updated 2 years ago
bytedance / QSync
Official repository for "IPDPS '24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".
☆20 Updated last year
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆35 Updated 2 years ago
SqueezeAILab / SqueezedAttention
[ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference
☆54 Updated 11 months ago
hku-systems / naspipe
☆14 Updated 3 years ago
kazukiosawa / pipe-fisher
☆10 Updated 2 years ago
llm-db / FineInfer
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
☆19 Updated last year
vineeths96 / Gradient-Compression
We present a set of all-reduce compatible gradient compression algorithms which significantly reduce the communication overhead while mai…
☆10 Updated 3 years ago
DerrickYLJ / TidalDecode
[ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
☆48 Updated 2 months ago
DS3Lab / Decentralized_FM_alpha
☆19 Updated 2 years ago
zyxxmu / cam
PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference
☆47 Updated last year
sjtu-epcc / DVABatch
☆21 Updated 3 years ago
ParCIS / Ok-Topk
Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…
☆27 Updated 2 years ago
kimihe / Octo
Create tiny ML systems for on-device learning.
☆20 Updated 4 years ago
lzhangbv / dear_pytorch
[ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
☆11 Updated last year
zhuzilin / pytorch-malloc
An external memory allocator example for PyTorch.
☆16 Updated 2 months ago
hmarkc / parallel-prompt-decoding
Efficient LLM Inference Acceleration using Prompting
☆50 Updated last year
UbiquitousLearning / Paper-list-resource-efficient-large-language-model
☆101 Updated last year
fmfi-compbio / admm-pruning
☆30 Updated last year
NiuChaoyue / Secure-Federated-Submodel-Learning
☆15 Updated 4 years ago
falcon-xu / early-exit-papers
A curated list of early-exiting papers (LLM, CV, NLP, etc.)
☆67 Updated last year
YuhanLiu11 / AutoFreeze
☆22 Updated 4 years ago
mlpen / LookupFFN
☆20 Updated last year
VITA-Group / Q-Hitter
☆15 Updated last year
casys-kaist / EnvPipe
☆25 Updated 2 years ago
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆25 Updated 11 months ago
S-Lab-System-Group / Hydro
Surrogate-based Hyperparameter Tuning System
☆27 Updated 2 years ago