NexaAI / Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
☆1,071Updated 3 months ago
Alternatives and similar repositories for Awesome-LLMs-on-device:
Users who are interested in Awesome-LLMs-on-device are comparing it to the repositories listed below
- Unified KV Cache Compression Methods for Auto-Regressive Models☆1,018Updated 3 months ago
- Fast Multimodal LLM on Mobile Devices☆830Updated last month
- Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.☆2,390Updated this week
- Align Anything: Training All-modality Model with Feedback☆3,448Updated this week
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations☆222Updated 6 months ago
- [ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2☆199Updated last month
- adds Sequence Parallelism into LLaMA-Factory☆464Updated last week
- Recipes to train reward model for RLHF.☆1,296Updated 2 months ago
- A highly optimized LLM inference acceleration engine for Llama and its variants.☆885Updated last week
- minimal-cost for training 0.5B R1-Zero☆706Updated this week
- [NeurIPS'24 Spotlight, ICLR'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which r…☆971Updated last week
- [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a V…☆461Updated this week
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆696Updated last week
- Easiest and laziest way for building multi-agent LLMs applications.☆1,632Updated this week
- Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.☆1,197Updated last week
- [Up-to-date] Large Language Model Agent: A Survey on Methodology, Applications and Challenges☆483Updated last week
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS☆1,171Updated 3 weeks ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models☆252Updated last month
- Low-bit LLM inference on CPU with lookup table☆735Updated 3 months ago
- Build multimodal language agents for fast prototype and production☆2,472Updated last month
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads☆2,503Updated 10 months ago
- Awesome Mobile LLMs☆169Updated last month
- Fast inference from large language models via speculative decoding☆714Updated 8 months ago
- Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs☆744Updated this week
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework☆202Updated 3 weeks ago
- DeepRetrieval - Hacking 🔥Real Search Engines and Retrievers with LLM via RL☆423Updated last week
- The official implementation of Self-Play Preference Optimization (SPPO)☆536Updated 3 months ago
- Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language m…☆4,507Updated last month
- Survey Paper List - Efficient LLM and Foundation Models☆246Updated 7 months ago
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.☆1,720Updated 3 months ago