NexaAI / Awesome-LLMs-on-device

Awesome LLMs on Device: A Comprehensive Survey

☆1,288

Alternatives and similar repositories for Awesome-LLMs-on-device

Users who are interested in Awesome-LLMs-on-device are comparing it to the repositories listed below.

Zefan-Cai / KVCache-Factory
Unified KV Cache Compression Methods for Auto-Regressive Models
☆1,288 Updated 11 months ago
AIoT-MLSys-Lab / SVD-LLM
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2
☆270 Updated 3 months ago
UbiquitousLearning / mllm
Fast Multimodal LLM on Mobile Devices
☆1,277 Updated last week
zhihu / ZhiLight
A highly optimized LLM inference acceleration engine for Llama and its variants.
☆904 Updated 5 months ago
Simple-Efficient / RL-Factory
Train your Agent model via our easy and efficient framework
☆1,658 Updated 2 weeks ago
EvolvingLMMs-Lab / lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
☆3,387 Updated this week
PKU-Alignment / align-anything
Align Anything: Training All-modality Model with Feedback
☆4,606 Updated 3 weeks ago
luo-junyu / Awesome-Agent-Papers
[Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges
☆2,255 Updated last month
microsoft / MInference
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention…
☆1,166 Updated 2 months ago
ChenmienTan / RL2
☆956 Updated this week
FasterDecoding / Medusa
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
☆2,672 Updated last year
bytedance / ABQ-LLM
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
☆238 Updated last year
AIoT-MLSys-Lab / Efficient-LLMs-Survey
[TMLR 2024] Efficient Large Language Models: A Survey
☆1,239 Updated 5 months ago
dhcode-cpp / X-R1
Minimal-cost training of a 0.5B R1-Zero model
☆792 Updated 7 months ago
Qihoo360 / 360-LLaMA-Factory
Adds Sequence Parallelism to LLaMA-Factory
☆599 Updated 2 months ago
hahnyuan / LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod…
☆599 Updated last year
horseee / Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
☆1,915 Updated 6 months ago
OS-Agent-Survey / OS-Agent-Survey
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
☆373 Updated 4 months ago
MilkThink-Lab / RouterEval
A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in Large Language Models
☆101 Updated last month
stevelaskaridis / awesome-mobile-llm
Awesome Mobile LLMs
☆282 Updated 3 weeks ago
mit-han-lab / TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
☆932 Updated last year
om-ai-lab / OmAgent
Build multimodal language agents for fast prototyping and production
☆2,608 Updated 9 months ago
HITsz-TMG / Uni-MoE
Uni-MoE: Lychee's Large Multimodal Model Family.
☆1,049 Updated this week
uclaml / SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
☆583 Updated 10 months ago
feifeibear / LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
☆868 Updated last year
SafeAILab / EAGLE
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
☆2,065 Updated 3 weeks ago
RLHFlow / RLHF-Reward-Modeling
Recipes for training reward models for RLHF.
☆1,488 Updated 7 months ago
InternLM / InternBootcamp
☆331 Updated 3 months ago
KodCode-AI / kodcode
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions, all in one framework
☆297 Updated 3 months ago
HKUDS / SepLLM
[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator"
☆556 Updated 4 months ago