☆27Aug 25, 2023Updated 2 years ago
Alternatives and similar repositories for CocktailSGD
Users that are interested in CocktailSGD are comparing it to the libraries listed below
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)☆19May 28, 2024Updated last year
- ☆13Jan 15, 2025Updated last year
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- This repository contains code for the MicroAdam paper.☆21Dec 14, 2024Updated last year
- ☆10Apr 29, 2024Updated last year
- summer school materials☆46Aug 4, 2023Updated 2 years ago
- Associated codebase for Byzantine-resilient distributed / decentralized machine learning papers from INSPIRE Lab☆15Oct 11, 2021Updated 4 years ago
- DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation☆16Jul 13, 2020Updated 5 years ago
- Code for paper "Byzantine-Resilient Decentralized Stochastic Optimization with Robust Aggregation Rules"☆20Apr 19, 2024Updated last year
- ☆19May 4, 2023Updated 2 years ago
- 🎬 3.7× faster video generation E2E 🖼️ 1.6× faster image generation E2E ⚡ ColumnSparseAttn 9.3× vs FlashAttn‑3 💨 ColumnSparseGEMM 2.5× …☆103Sep 8, 2025Updated 6 months ago
- Code for paper "Byzantine-Resilient Distributed Finite-Sum Optimization over Networks"☆18Nov 5, 2020Updated 5 years ago
- List Flower resources☆12Feb 4, 2022Updated 4 years ago
- [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation☆29Updated this week
- Official implementation for Text Generation Beyond Discrete Token Sampling☆22Aug 11, 2025Updated 7 months ago
- ☆10Jun 19, 2023Updated 2 years ago
- An Ultra-Long Output Reinforcement Learning Approach☆23Jul 31, 2025Updated 7 months ago
- ☆150Jun 2, 2023Updated 2 years ago
- ☆13Jun 8, 2021Updated 4 years ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- Efficient misspecification uncertainties for linear regression☆16Updated this week
- 🔋🎯 Thread-level, NUMA-aware energy attribution model for multi-tenancy☆55Jul 13, 2023Updated 2 years ago
- Blog post☆17Feb 16, 2024Updated 2 years ago
- ☆27Jul 18, 2025Updated 8 months ago
- The Atlas multi-GPU quantum circuit simulator.☆15Aug 17, 2024Updated last year
- [ICML24] Official Implementation of "ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections"☆16May 31, 2024Updated last year
- fast trainer for educational purposes☆24Mar 12, 2026Updated last week
- [ICLR24] Better Neural PDE Solvers Through Data-Free Mesh Movers☆17Mar 20, 2024Updated 2 years ago
- [ICML 2021] "Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators" by Yonggan Fu, Yonga…☆16Jan 3, 2022Updated 4 years ago
- Inducing Point Operator Transformer: A Flexible and Scalable Architecture for Solving PDEs (AAAI 2024)☆15Jul 30, 2024Updated last year
- Grams: Gradient Descent with Adaptive Momentum Scaling (ICLR 2025 Workshop)☆17Mar 6, 2025Updated last year
- Implementation of (overlap) local SGD in Pytorch☆34Jul 12, 2020Updated 5 years ago
- Code related to "Beyond spectral gap: The role of the topology in decentralized learning".☆14Jun 7, 2022Updated 3 years ago
- ☆23Nov 2, 2019Updated 6 years ago
- GitHub Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆18Apr 16, 2025Updated 11 months ago
- Code for the paper "Secure Distributed Training at Scale" (ICML 2022)☆16Feb 4, 2025Updated last year
- An extension of TVMScript to write simple and high performance GPU kernels with tensorcore.☆50Jul 23, 2024Updated last year
- torch implementation of diloco☆22May 31, 2024Updated last year
- ☆33Oct 13, 2025Updated 5 months ago