ByteDance-Seed / StragglerAnalysis
☆38 Updated 3 months ago
Alternatives and similar repositories for StragglerAnalysis
Users who are interested in StragglerAnalysis are comparing it to the repositories listed below.
- Stateful LLM Serving ☆79 Updated 4 months ago
- A lightweight design for computation-communication overlap. ☆155 Updated last month
- ☆67 Updated last year
- kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference. ☆35 Updated last week
- ☆48 Updated 7 months ago
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines ☆19 Updated last year
- A framework for generating realistic LLM serving workloads ☆51 Updated last month
- ☆209 Updated this week
- Microsoft Collective Communication Library ☆63 Updated 8 months ago
- FlexFlow Serve: Low-Latency, High-Performance LLM Serving ☆48 Updated last week
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆23 Updated 2 months ago
- DeeperGEMM: crazy optimized version ☆71 Updated 3 months ago
- ☆51 Updated 2 months ago
- gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling ☆37 Updated this week
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆170 Updated 10 months ago
- A resilient distributed training framework ☆95 Updated last year
- ☆37 Updated 7 months ago
- Aims to implement dual-port and multi-qp solutions in deepEP ibrc transport ☆57 Updated 2 months ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models. ☆66 Updated 4 months ago
- Efficient Compute-Communication Overlap for Distributed LLM Inference ☆26 Updated last month
- ☆115 Updated 9 months ago
- Tile-based language built for AI computation across all scales ☆31 Updated this week
- ☆60 Updated 3 months ago
- ☆25 Updated last year
- Artifact of the OSDI '24 paper "Llumnix: Dynamic Scheduling for Large Language Model Serving" ☆62 Updated last year
- NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading ☆49 Updated last month
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆125 Updated last year
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments. ☆38 Updated this week
- ☆109 Updated 8 months ago
- A simple calculation for LLM MFU. ☆42 Updated 5 months ago