mlc-ai / binary-mlc-llm-libs
☆247Updated last month
Alternatives and similar repositories for binary-mlc-llm-libs
Users who are interested in binary-mlc-llm-libs are comparing it to the libraries listed below.
- A mobile Implementation of llama.cpp☆312Updated last year
- llama.cpp tutorial on Android phone☆110Updated last month
- IRIS is an android app for interfacing with GGUF / llama.cpp models locally.☆216Updated 4 months ago
- Stable Diffusion inference using the CPU of an Android phone☆152Updated last year
- Based on mlc-llm; a personal attempt to deploy and run large models on an Android phone☆86Updated 10 months ago
- ☆158Updated last week
- ☆59Updated last year
- ☆541Updated 7 months ago
- On-device LLM Inference Powered by X-Bit Quantization☆249Updated 2 weeks ago
- GPTQ inference Triton kernel☆302Updated 2 years ago
- automatically quant GGUF models☆184Updated last week
- a lightweight LLM model inference framework☆730Updated last year
- Train your own small bitnet model☆72Updated 8 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆126Updated 2 years ago
- Offline voice input panel & keyboard with punctuation for Android.☆105Updated last year
- Local ML voice chat using high-end models.☆172Updated 2 weeks ago
- Awesome Mobile LLMs☆204Updated 3 weeks ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,529Updated 3 months ago
- C++ implementation for 💫StarCoder☆453Updated last year
- Locally run an Instruction-Tuned Chat-Style LLM (Android/Linux/Windows/Mac)☆263Updated 2 years ago
- Deploying a large language model on Android phones based on MNN-llm: Qwen1.5-0.5B-Chat☆79Updated last year
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models☆273Updated last month
- An unsupervised model merging algorithm for Transformers-based language models.☆105Updated last year
- llama.cpp fork used by GPT4All☆55Updated 4 months ago
- Instructions for installing Open Interpreter on your Android device.☆225Updated last year
- C++ implementation of Qwen-LM☆594Updated 6 months ago
- A mobile Implementation of llama.cpp☆25Updated last year
- ☆118Updated last year
- Falcon LLM ggml framework with CPU and GPU support☆246Updated last year
- Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.☆2,046Updated 3 weeks ago