Elixir: Train a Large Language Model on a Small GPU Cluster
☆15Jun 8, 2023Updated 2 years ago
Alternatives and similar repositories for Elixir
Users who are interested in Elixir are comparing it to the libraries listed below.
- PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.☆10Feb 10, 2022Updated 4 years ago
- ☆26Aug 31, 2023Updated 2 years ago
- Deft: A Scalable Tree Index for Disaggregated Memory☆23Apr 23, 2025Updated 10 months ago
- A Sparse-tensor Communication Framework for Distributed Deep Learning☆13Nov 1, 2021Updated 4 years ago
- PyTorch implementation of MVP: a multi-stage vision-language pre-training framework☆11Apr 23, 2022Updated 3 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining☆12Dec 4, 2023Updated 2 years ago
- An external memory allocator example for PyTorch.☆16Aug 10, 2025Updated 6 months ago
- A memory efficient DLRM training solution using ColossalAI☆107Nov 22, 2022Updated 3 years ago
- FTPipe and related pipeline model parallelism research.☆44May 16, 2023Updated 2 years ago
- Official repository for "IPDPS' 24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".☆20Feb 23, 2024Updated 2 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Mar 13, 2023Updated 2 years ago
- ☆24Nov 22, 2022Updated 3 years ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆120Mar 13, 2024Updated last year
- ☆251Jul 25, 2024Updated last year
- An auxiliary project analyzing the characteristics of KV in DiT Attention.☆33Nov 29, 2024Updated last year
- Artifact for PPoPP22 QGTC: Accelerating Quantized GNN via GPU Tensor Core.☆30Feb 12, 2022Updated 4 years ago
- Ancestral Gumbel-Top-k Sampling☆25Apr 11, 2020Updated 5 years ago
- ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).☆40Sep 10, 2024Updated last year
- AgentHive provides the primitives and helpers for a seamless usage of robohive within TorchRL.☆35Jan 12, 2024Updated 2 years ago
- PetPS: Supporting Huge Embedding Models with Tiered Memory☆33May 21, 2024Updated last year
- ☆13Dec 13, 2022Updated 3 years ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- Continuous Pipelined Speculative Decoding☆16Jan 4, 2026Updated last month
- netbeacon - monitoring your network capture, NIDS or network analysis process☆19Oct 26, 2013Updated 12 years ago
- Notes and examples for getting started with parallel computing in CUDA.☆13Nov 1, 2019Updated 6 years ago
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks.☆14Oct 3, 2022Updated 3 years ago
- Face Swap☆12Jun 2, 2023Updated 2 years ago
- Rust CLI tool for syncing Claude Code conversation history across machines using git repositories.☆20Updated this week
- ☆10May 16, 2021Updated 4 years ago
- ☆10Jun 4, 2021Updated 4 years ago
- Text Classification and NLP in Tensorflow☆10Jul 20, 2018Updated 7 years ago
- Paper Review about Speech Recognition · NLP☆10Mar 25, 2021Updated 4 years ago
- Proposal for the next generation of course-oriented IR.☆10Dec 24, 2021Updated 4 years ago
- ☆15Jul 18, 2023Updated 2 years ago
- Chaitin-Briggs register-allocation algorithm (LLVM back-end)☆12Jan 6, 2016Updated 10 years ago
- Peking University Convex Optimization Course given by Professor Wen Zaiwen☆11Jan 11, 2018Updated 8 years ago
- 🛠Robust SSH: auto-reconnect SSH session that preserves your running shell and command. Intuitive, no server-side setup, aimed at simplic…☆13Nov 14, 2025Updated 3 months ago
- A tool for cross-checking Verilog compilers☆14Apr 16, 2025Updated 10 months ago
- How to plot for papers, slides, demos, etc.☆10Apr 7, 2022Updated 3 years ago