hao-ai-lab / MuxServe
☆30 Updated 3 months ago
Related projects:
- ☆30 Updated this week
- Stateful LLM Serving ☆25 Updated last month
- PyTorch library for cost-effective, fast and easy serving of MoE models. ☆90 Updated last month
- A resilient distributed training framework ☆78 Updated 5 months ago
- ☆47 Updated 3 weeks ago
- Python package for rematerialization-aware gradient checkpointing ☆22 Updated 10 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆92 Updated 6 months ago
- PyTorch bindings for CUTLASS grouped GEMM. ☆41 Updated 3 weeks ago
- ☆15 Updated this week
- A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆110 Updated last month
- Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. ☆41 Updated 9 months ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.