DataStates / datastates-llm
LLM checkpointing for DeepSpeed/Megatron
☆17 · Updated this week
Alternatives and similar repositories for datastates-llm
Users interested in datastates-llm are comparing it to the libraries listed below.
- A resilient distributed training framework ☆95 · Updated last year
- Dynamic resource changes for multi-dimensional parallelism training ☆25 · Updated 7 months ago
- ☆31 · Updated last month
- ☆25 · Updated last year
- Stateful LLM Serving ☆73 · Updated 3 months ago
- ☆84 · Updated 3 years ago
- ☆53 · Updated 4 years ago
- ☆62 · Updated last year
- Complete GPU residency for ML ☆17 · Updated last week
- NEO is an LLM inference engine built to relieve the GPU memory crisis through CPU offloading ☆39 · Updated last week
- An interference-aware scheduler for fine-grained GPU sharing ☆140 · Updated 5 months ago
- ☆16 · Updated 2 years ago
- Artifact of the OSDI '24 paper "Llumnix: Dynamic Scheduling for Large Language Model Serving" ☆61 · Updated last year
- ☆15 · Updated 2 years ago
- ☆104 · Updated 7 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models ☆67 · Updated 3 months ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters ☆26 · Updated last year
- A lightweight design for computation-communication overlap ☆143 · Updated last week
- ☆14 · Updated 3 years ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces ☆52 · Updated 10 months ago
- ☆32 · Updated last year
- ☆74 · Updated 4 years ago
- Official repository for the paper "DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines" ☆19 · Updated last year
- ☆9 · Updated last year
- ☆44 · Updated 11 months ago
- Scalable long-context LLM decoding that leverages sparsity by treating the KV cache as a vector storage system ☆54 · Updated last week
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24] ☆25 · Updated 7 months ago
- ☆24 · Updated 2 years ago
- PyTorch implementation of the paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline" ☆88 · Updated 2 years ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters ☆39 · Updated 2 years ago