stevelaskaridis / awesome-mobile-llm
Awesome Mobile LLMs
☆267 · Updated 3 weeks ago
Alternatives and similar repositories for awesome-mobile-llm
Users interested in awesome-mobile-llm are comparing it to the libraries listed below.
- High-speed, easy-to-use LLM serving framework for local deployment ☆130 · Updated 3 months ago
- Fast Multimodal LLM on Mobile Devices ☆1,156 · Updated this week
- ☆63 · Updated 11 months ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆68 · Updated last year
- TinyChatEngine: On-Device LLM Inference Library ☆909 · Updated last year
- 1.58 Bit LLM on Apple Silicon using MLX (see the ternary-quantization sketch after this list) ☆225 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization ☆349 · Updated last year
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU ☆690 · Updated this week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆344 · Updated 6 months ago
- VPTQ, a flexible and extreme low-bit quantization algorithm ☆661 · Updated 6 months ago
- Low-bit LLM inference on CPU/NPU with lookup table ☆884 · Updated 5 months ago
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆93 · Updated this week
- ☆41 · Updated 7 months ago
- Awesome list for LLM quantization ☆334 · Updated 3 weeks ago
- A family of compressed models obtained via pruning and knowledge distillation ☆355 · Updated 11 months ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆250 · Updated 5 months ago
- ☆456 · Updated this week
- ☆218 · Updated 9 months ago
- Efficient LLM Inference over Long Sequences ☆390 · Updated 4 months ago
- KV cache compression for high-throughput LLM inference ☆144 · Updated 9 months ago
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices ☆664 · Updated 5 months ago
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆309 · Updated 5 months ago
- A collection of all available inference solutions for LLMs ☆91 · Updated 8 months ago
- ☆98 · Updated last year
- LLM model quantization (compression) toolkit with hw acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU vi… ☆860 · Updated this week
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆258 · Updated this week
- ☆90 · Updated 3 weeks ago
- On-device LLM Inference Powered by X-Bit Quantization ☆272 · Updated 3 months ago
- 🤗 Optimum ExecuTorch ☆77 · Updated last week
- LLM Inference on consumer devices ☆125 · Updated 7 months ago
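
Several entries above revolve around ternary ("1.58-bit") quantization, where each weight is stored as one of {-1, 0, +1} together with a shared scale. As a concept illustration only (not code from any listed repository), here is a minimal NumPy sketch of the absmean recipe described in the BitNet b1.58 paper; the function names are hypothetical.

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray):
    """Quantize a weight tensor to codes in {-1, 0, +1} plus one
    per-tensor scale, following the absmean scheme from BitNet b1.58."""
    scale = np.mean(np.abs(w)) + 1e-8            # per-tensor absmean scale
    codes = np.clip(np.round(w / scale), -1, 1)  # ternary codes
    return codes.astype(np.int8), float(scale)

def ternary_dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from codes and scale."""
    return codes.astype(np.float32) * scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 8)).astype(np.float32)
    codes, scale = absmean_ternary_quantize(w)
    err = np.abs(w - ternary_dequantize(codes, scale)).mean()
    print(f"scale={scale:.4f}, mean abs error={err:.4f}")
```

In practice, on-device runtimes like those listed above pack the ternary codes into sub-byte formats and fuse dequantization into their matmul kernels; the sketch only shows the numerical mapping.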