mlc-ai / binary-mlc-llm-libs
☆236Updated 5 months ago
Alternatives and similar repositories for binary-mlc-llm-libs:
Users who are interested in binary-mlc-llm-libs are comparing it to the repositories listed below
- llama.cpp tutorial on Android phone☆98Updated 2 weeks ago
- A mobile Implementation of llama.cpp☆308Updated last year
- IRIS is an android app for interfacing with GGUF / llama.cpp models locally.☆192Updated 2 months ago
- An Ollama client for Android!☆83Updated 11 months ago
- On-device LLM Inference Powered by X-Bit Quantization☆234Updated 2 weeks ago
- Stable Diffusion inference on the CPU of an Android phone☆151Updated last year
- Automatically quantize GGUF models☆168Updated this week
- MobiLlama : Small Language Model tailored for edge devices☆632Updated last year
- A set of bash scripts to automate deployment of GGML/GGUF models [default: RWKV] with the use of KoboldCpp on Android - Termux☆41Updated 10 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆122Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference☆52Updated last year
- Running any GGUF SLMs/LLMs locally, on-device in Android☆263Updated this week
- MiniCPM on Android platform.☆629Updated last month
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,509Updated last month
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)☆573Updated this week
- Run RWKV V4 ONNX on an Android CPU☆70Updated last year
- An unsupervised model merging algorithm for Transformers-based language models.☆105Updated 11 months ago
- MiniCPM on iOS.☆68Updated last month
- A mobile Implementation of llama.cpp☆25Updated last year
- C++ implementation for 💫StarCoder☆453Updated last year
- A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci…☆310Updated last year
- llama.cpp fork with additional SOTA quants and improved performance☆292Updated this week
- llama.cpp fork used by GPT4All☆55Updated 2 months ago
- AI for all: Build the large graph of the language models☆263Updated 10 months ago
- ☆122Updated 9 months ago
- Implementation of the RWKV language model in pure WebGPU/Rust.☆298Updated last week
- EfficientQAT: Efficient Quantization-Aware Training for Large Language Models☆263Updated 6 months ago
- Based on mlc-llm, a personal attempt to deploy and run large models on Android phones☆86Updated 8 months ago
- ☆156Updated 3 weeks ago
- Android JNI for port of Facebook's LLaMA model in C/C++☆22Updated last year