jeho-lee / Awesome-On-Device-AI-Systems
☆33 Updated last week
Alternatives and similar repositories for Awesome-On-Device-AI-Systems:
Users who are interested in Awesome-On-Device-AI-Systems are comparing it to the libraries listed below:
- ☆19 Updated last year
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation ☆25 Updated 3 years ago
- Multi-DNN Inference Engine for Heterogeneous Mobile Processors ☆32 Updated 9 months ago
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale ☆109 Updated 2 months ago
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals. ☆83 Updated 10 months ago
- LaLaRAND: Flexible Layer-by-Layer CPU/GPU Scheduling for Real-Time DNN Tasks ☆13 Updated 3 years ago
- This is a list of awesome edgeAI inference related papers. ☆95 Updated last year
- ☆138 Updated 9 months ago
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆111 Updated 2 months ago
- [HPCA'24] Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System ☆42 Updated last year
- A version of XRBench-MAESTRO used for MLSys 2023 publication ☆23 Updated last year
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆104 Updated last week
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24) ☆14 Updated 9 months ago
- NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing ☆80 Updated 10 months ago
- LLM Inference analyzer for different hardware platforms ☆62 Updated 2 weeks ago
- ☆104 Updated last week
- [ACM EuroSys '23] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access ☆56 Updated last year
- ☆45 Updated 11 months ago
- Curated collection of papers in MoE model inference ☆150 Updated 2 months ago
- ☆95 Updated last year
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022] ☆18 Updated 2 years ago
- ☆55 Updated last year
- ☆98 Updated 6 months ago
- ☆200 Updated last year
- ☆30 Updated 2 months ago
- ☆64 Updated 10 months ago
- ☆66 Updated 3 weeks ago
- ☆77 Updated last year
- Experimental deep learning framework written in Rust ☆14 Updated 2 years ago
- LLM serving cluster simulator ☆97 Updated 11 months ago