cakeng / ASPEN
This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Parallelization of Deep Neural Networks.
☆11 Updated last year
Alternatives and similar repositories for ASPEN
Users who are interested in ASPEN compare it to the libraries listed below.
- ☆25 Updated last year
- LLM Inference analyzer for different hardware platforms ☆80 Updated last week
- ☆40 Updated 4 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust. ☆14 Updated 7 months ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆13 Updated last year
- ☆19 Updated 3 years ago
- ☆10 Updated last year
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆23 Updated 2 months ago
- ☆14 Updated 3 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 Updated 2 years ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization" ☆34 Updated 2 weeks ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆26 Updated 2 years ago
- LLM serving cluster simulator ☆107 Updated last year
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆52 Updated last year
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo ☆18 Updated 2 years ago
- ☆14 Updated 3 years ago
- Artifacts of EVT ASPLOS'24 ☆26 Updated last year
- ☆149 Updated 11 months ago
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… ☆40 Updated last year
- Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion ☆32 Updated last year
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation ☆25 Updated 4 years ago
- Repository for MLCommons Chakra schema and tools ☆39 Updated last year
- MobiSys#114 ☆21 Updated last year
- ASPLOS'24: Optimal Kernel Orchestration for Tensor Programs with Korch ☆37 Updated 3 months ago
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23) ☆82 Updated 2 years ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices ☆12 Updated 4 years ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆41 Updated 7 months ago
- An Attention Superoptimizer ☆22 Updated 6 months ago
- Microsoft Collective Communication Library ☆64 Updated 7 months ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆40 Updated 2 years ago