NexaAI / Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
★1,288 Updated 11 months ago
Alternatives and similar repositories for Awesome-LLMs-on-device
Users who are interested in Awesome-LLMs-on-device are comparing it to the repositories listed below.
- Unified KV Cache Compression Methods for Auto-Regressive Models ★1,288 Updated 11 months ago
- [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2 ★270 Updated 3 months ago
- Fast Multimodal LLM on Mobile Devices ★1,277 Updated last week
- A highly optimized LLM inference acceleration engine for Llama and its variants. ★904 Updated 5 months ago
- Train your Agent model via our easy and efficient framework ★1,658 Updated 2 weeks ago
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks ★3,387 Updated this week
- Align Anything: Training All-modality Model with Feedback ★4,606 Updated 3 weeks ago
- [Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges ★2,255 Updated last month
- [NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention… ★1,166 Updated 2 months ago
- ★956 Updated this week
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ★2,672 Updated last year
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations ★238 Updated last year
- [TMLR 2024] Efficient Large Language Models: A Survey ★1,239 Updated 5 months ago
- minimal-cost for training 0.5B R1-Zero ★792 Updated 7 months ago
- adds Sequence Parallelism into LLaMA-Factory ★599 Updated 2 months ago
- Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod… ★599 Updated last year
- A curated list for Efficient Large Language Models ★1,915 Updated 6 months ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral). ★373 Updated 4 months ago
- A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in Large Language Models ★101 Updated last month
- Awesome Mobile LLMs ★282 Updated 3 weeks ago
- TinyChatEngine: On-Device LLM Inference Library ★932 Updated last year
- Build multimodal language agents for fast prototype and production ★2,608 Updated 9 months ago
- Uni-MoE: Lychee's Large Multimodal Model Family. ★1,049 Updated this week
- The official implementation of Self-Play Preference Optimization (SPPO) ★583 Updated 10 months ago
- Fast inference from large language models via speculative decoding ★868 Updated last year
- Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25). ★2,065 Updated 3 weeks ago
- Recipes to train reward model for RLHF. ★1,488 Updated 7 months ago
- ★331 Updated 3 months ago
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework ★297 Updated 3 months ago
- [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator" ★556 Updated 4 months ago