NexaAI / Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
☆1,137 Updated 5 months ago
Alternatives and similar repositories for Awesome-LLMs-on-device
Users who are interested in Awesome-LLMs-on-device are comparing it to the repositories listed below.
- Unified KV Cache Compression Methods for Auto-Regressive Models ☆1,166 Updated 5 months ago
- Fast Multimodal LLM on Mobile Devices ☆935 Updated 2 weeks ago
- Survey Paper List - Efficient LLM and Foundation Models ☆249 Updated 9 months ago
- A highly optimized LLM inference acceleration engine for Llama and its variants. ☆897 Updated this week
- [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2 ☆223 Updated 3 months ago
- Awesome LLM compression research papers and tools. ☆1,578 Updated 2 weeks ago
- One for All Modalities Evaluation Toolkit - including text, image, video, audio tasks. ☆2,691 Updated this week
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations ☆225 Updated 9 months ago
- Train your Agent model via our easy and efficient framework ☆1,201 Updated this week
- Awesome Mobile LLMs ☆206 Updated 3 weeks ago
- Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod… ☆491 Updated 9 months ago
- The official implementation of Self-Play Preference Optimization (SPPO) ☆567 Updated 5 months ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models ☆258 Updated 3 months ago
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,176 Updated last week
- 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉 ☆4,172 Updated last week
- Align Anything: Training All-modality Model with Feedback ☆4,085 Updated last month
- A curated list for Efficient Large Language Models ☆1,757 Updated 2 weeks ago
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V… ☆503 Updated this week
- [Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges ☆1,015 Updated last week
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral). ☆306 Updated last week
- Recipes to train reward model for RLHF. ☆1,393 Updated 2 months ago
- Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language m… ☆4,589 Updated last week
- Awesome list for LLM quantization ☆240 Updated 3 weeks ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆811 Updated last week
- Large Language Model (LLM) Systems Paper List ☆1,329 Updated last week
- minimal-cost for training 0.5B R1-Zero ☆743 Updated last month
- [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se… ☆708 Updated 3 months ago
- Build multimodal language agents for fast prototype and production ☆2,518 Updated 3 months ago
- Android ChatBot with Octopus v2 - Function Calling Demo ☆15 Updated 11 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆463 Updated this week