NexaAI / Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
☆931Updated last month
Related projects
Alternatives and complementary repositories for Awesome-LLMs-on-device
- Unified KV Cache Compression Methods for LLMs☆728Updated this week
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations☆226Updated last month
- Accelerating the development of large multimodal models (LMMs) with lmms-eval☆2,068Updated this week
- Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"☆770Updated 2 months ago
- The official implementation of Self-Play Preference Optimization (SPPO)☆498Updated 3 months ago
- A Multimodal Native Agent Framework for Smart Hardware and More☆1,286Updated this week
- A recipe for online RLHF and online iterative DPO.☆434Updated last week
- Recipes to train reward model for RLHF.☆903Updated this week
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding☆230Updated 2 months ago
- Large Reasoning Models☆580Updated this week
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.☆1,565Updated 2 weeks ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models☆214Updated last month
- Easiest and laziest way for building multi-agent LLMs applications.☆1,021Updated this week
- OLMoE: Open Mixture-of-Experts Language Models☆460Updated this week
- Improve Llama-2's proficiency in comprehension, generation, and translation of Chinese.☆532Updated 7 months ago
- An MBTI Exploration of Large Language Models☆474Updated 9 months ago
- Industrial-first evaluation benchmark for LLMs in the DevOps/AIOps domain.☆686Updated 4 months ago
- Fast Multimodal LLM on Mobile Devices☆532Updated this week
- DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads☆373Updated 3 weeks ago
- Fast inference from large language models via speculative decoding☆569Updated 2 months ago
- EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders☆539Updated 2 months ago
- "AnyGraph: Graph Foundation Model in the Wild"☆185Updated 2 months ago
- Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language m…☆3,896Updated this week
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a…☆355Updated this week
- FuseAI Project☆452Updated 3 months ago
- A tutorial based on MetaGPT to quickly help you understand the concepts of agent and multi-agent and get started with coding development. 基…☆1,369Updated 6 months ago
- A deployment, monitoring and autoscaling service towards serverless LLM serving.☆162Updated last week
- [NeurIPS'24 Spotlight] To speed up long-context LLMs' inference, compute the attention approximately with dynamic sparsity, which reduces in…☆791Updated this week
- ☆368Updated 6 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models☆353Updated this week