NexaAI / Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
☆1,216 Updated 8 months ago
Alternatives and similar repositories for Awesome-LLMs-on-device
Users who are interested in Awesome-LLMs-on-device compare it to the libraries listed below.
- Unified KV Cache Compression Methods for Auto-Regressive Models ☆1,249 Updated 9 months ago
- Fast Multimodal LLM on Mobile Devices ☆1,086 Updated last week
- [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2 ☆253 Updated last month
- A highly optimized LLM inference acceleration engine for Llama and its variants. ☆900 Updated 2 months ago
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations ☆234 Updated last year
- Train your Agent model via our easy and efficient framework ☆1,532 Updated last week
- Align Anything: Training All-modality Model with Feedback ☆4,557 Updated last month
- ☆870 Updated last week
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks ☆3,124 Updated last week
- TVM Documentation in Chinese Simplified / TVM 中文文档 ☆2,398 Updated this week
- adds Sequence Parallelism into LLaMA-Factory ☆564 Updated last week
- The official implementation of Self-Play Preference Optimization (SPPO) ☆583 Updated 8 months ago
- Awesome Mobile LLMs ☆247 Updated 2 weeks ago
- Awesome LLM compression research papers and tools. ☆1,671 Updated 3 months ago
- [NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention… ☆1,133 Updated this week
- Low-bit LLM inference on CPU/NPU with lookup table ☆862 Updated 4 months ago
- Build multimodal language agents for fast prototype and production ☆2,554 Updated 6 months ago
- [Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges ☆1,754 Updated last month
- [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator" ☆551 Updated 2 months ago
- Fast inference from large language models via speculative decoding ☆829 Updated last year
- Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod… ☆559 Updated last year
- minimal-cost for training 0.5B R1-Zero ☆771 Updated 4 months ago
- A powerful toolkit for compressing large models including LLM, VLM, and video generation models. ☆576 Updated last month
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆546 Updated this week
- A curated list for Efficient Large Language Models ☆1,873 Updated 3 months ago
- Run the latest LLMs and VLMs across GPU, NPU, and CPU with bindings for Python, Android Java, and iOS Swift, getting up and running quick… ☆5,007 Updated last week
- The repository of Uni-MoE model series ☆768 Updated 2 weeks ago
- ❓Curie: Automated and Rigorous Scientific Experimentation with AI Agents ☆288 Updated last week
- Recipes to train reward model for RLHF. ☆1,455 Updated 5 months ago
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,219 Updated 3 months ago