Fast Multimodal LLM on Mobile Devices
☆1,552 Jun 9, 2026 Updated 3 weeks ago
Alternatives and similar repositories for mllm
Users that are interested in mllm are comparing it to the libraries listed below.
- ☆67 Nov 16, 2024 Updated last year
- ☆43 Mar 29, 2025 Updated last year
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆30 Mar 5, 2024 Updated 2 years ago
- Survey Paper List - Efficient LLM and Foundation Models ☆264 Sep 22, 2024 Updated last year
- Low-bit LLM inference on CPU/NPU with lookup table ☆966 Jun 5, 2025 Updated last year
- the original reference implementation of a specified llama.cpp backend for Qualcomm Hexagon NPU on Android phone, history of ggml-hexagon… ☆45 Updated this week
- ☆215 Jan 17, 2024 Updated 2 years ago
- ☆164 Jun 21, 2026 Updated last week
- LLM inference in C/C++ ☆53 Jun 22, 2026 Updated last week
- Inference RWKV v5, v6 and v7 with Qualcomm AI Engine Direct SDK ☆94 Jun 8, 2026 Updated 3 weeks ago
- [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se… ☆845 Mar 6, 2025 Updated last year
- Let's use Qualcomm NPU in Android ☆20 Feb 18, 2025 Updated last year
- Self-implemented NN operators for Qualcomm's Hexagon NPU ☆75 Sep 30, 2025 Updated 9 months ago
- A demo of end-to-end federated learning system. ☆69 Jun 1, 2022 Updated 4 years ago
- Universal LLM Deployment Engine with ML Compilation ☆22,863 May 11, 2026 Updated last month
- 📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉 ☆5,355 Jun 23, 2026 Updated last week
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… ☆429 Updated this week
- Paper list for Personal LLM Agents ☆430 Updated this week
- LLM deployment project based on MNN. This project has been merged into MNN. ☆1,617 Jan 20, 2025 Updated last year
- ☆103 Jan 17, 2024 Updated 2 years ago
- High-speed and easy-use LLM serving framework for local deployment ☆155 Aug 7, 2025 Updated 10 months ago
- Strong and Open Vision Language Assistant for Mobile Devices ☆1,359 Apr 15, 2024 Updated 2 years ago
- Awesome Mobile LLMs ☆361 May 31, 2026 Updated last month
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models ☆68 Sep 22, 2024 Updated last year
- Demonstration of running a native LLM on Android device. ☆257 Updated this week
- Our unique contributions are in tools/train/benchmark. ☆22 Apr 14, 2025 Updated last year
- On-device AI across mobile, embedded and edge for PyTorch ☆4,766 Updated this week
- MobiSys#114 ☆23 Aug 17, 2023 Updated 2 years ago
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading"[MobiCom'2022] ☆19 Aug 4, 2022 Updated 3 years ago
- LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation ☆70 Aug 9, 2024 Updated last year
- Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) an… ☆1,133 Updated this week
- High-speed Large Language Model Serving for Local Deployment ☆9,586 May 11, 2026 Updated last month
- MNN ASR demo. ☆27 Mar 24, 2025 Updated last year
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆3,578 Jul 17, 2025 Updated 11 months ago
- TinyChatEngine: On-Device LLM Inference Library ☆956 Jul 4, 2024 Updated last year
- LLM inference in C/C++ ☆21 Oct 22, 2025 Updated 8 months ago
- llm-export can export LLM models to ONNX. ☆352 May 8, 2026 Updated last month
- MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI. ☆15,556 Updated this week
- Multi-DNN Inference Engine for Heterogeneous Mobile Processors ☆39 Jul 24, 2024 Updated last year