UbiquitousLearning / Mandheling-DSP-Training
The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022]
☆18 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for Mandheling-DSP-Training
- LLM Inference analyzer for different hardware platforms ☆42 · Updated this week
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24) ☆43 · Updated 5 months ago
- Compiler for Dynamic Neural Networks ☆43 · Updated last year
- DietCode Code Release ☆62 · Updated 2 years ago
- nnScaler: Compiling DNN models for Parallel Training ☆74 · Updated 3 weeks ago
- MobiSys#114 ☆21 · Updated last year
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆103 · Updated 2 years ago
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24) ☆79 · Updated 4 months ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆81 · Updated last year
- Summary of some awesome work for optimizing LLM inference ☆37 · Updated 2 weeks ago
- A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆132 · Updated last month
- Canvas: End-to-End Kernel Architecture Search in Neural Networks ☆13 · Updated this week
- LLM serving cluster simulator ☆81 · Updated 6 months ago
- The documents for TVM Unity ☆11 · Updated 3 months ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization" ☆27 · Updated 8 months ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24) ☆17 · Updated last week
- Multi-branch model for concurrent execution ☆16 · Updated last year
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆131 · Updated last year
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections ☆114 · Updated 2 years ago