cakeng / ASPEN
This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Parallelization of Deep Neural Networks.
☆13 Updated last year
Alternatives and similar repositories for ASPEN
Users who are interested in ASPEN are comparing it to the libraries listed below
- ☆26 Updated 2 years ago
- ☆41 Updated 5 years ago
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆36 Updated 5 months ago
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines ☆19 Updated 2 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 Updated 3 years ago
- ☆38 Updated 7 months ago
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆80 Updated 2 years ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆25 Updated 8 months ago
- [ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access ☆56 Updated 6 months ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆12 Updated last year
- ☆21 Updated 3 years ago
- ☆10 Updated 2 years ago
- LLM serving cluster simulator ☆135 Updated last year
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2… ☆15 Updated 2 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 Updated 3 years ago
- ☆37 Updated 3 months ago
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23) ☆93 Updated 2 years ago
- ☆14 Updated 4 years ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆77 Updated 3 months ago
- ☆16 Updated 9 months ago
- A reproduction of the libsmctrl paper, with an added Python interface so that compute resources can be allocated flexibly from Python ☆12 Updated last year
- Source code for the paper: "A Latency-Predictable Multi-Dimensional Optimization Framework for DNN-driven Autonomous Systems" ☆22 Updated 5 years ago
- Compiler for Dynamic Neural Networks ☆45 Updated 2 years ago
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion ☆32 Updated last year
- This is a list of awesome edgeAI inference related papers. ☆98 Updated 2 years ago
- Thunder Research Group's Collective Communication Library ☆47 Updated 7 months ago
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch ☆39 Updated 10 months ago
- ☆81 Updated 8 months ago
- ☆53 Updated last year
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆41 Updated last year