mlc-ai / binary-mlc-llm-libs
☆277 Updated last month
Alternatives and similar repositories for binary-mlc-llm-libs
Users interested in binary-mlc-llm-libs are comparing it to the libraries listed below.
- llama.cpp tutorial on an Android phone ☆138 Updated 7 months ago
- A mobile implementation of llama.cpp ☆323 Updated last year
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally. ☆257 Updated 10 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆130 Updated 2 years ago
- MiniCPM on the Android platform. ☆634 Updated 9 months ago
- A mobile implementation of llama.cpp ☆26 Updated 2 years ago
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: a small language model tailored for edge devices ☆668 Updated 7 months ago
- Stable Diffusion inference on Android phone CPUs ☆157 Updated 2 years ago
- Tool to download models from the Huggingface Hub and convert them to GGML/GGUF for llama.cpp ☆166 Updated 7 months ago
- Run any GGUF SLMs/LLMs locally, on-device, in Android ☆615 Updated 3 weeks ago
- C++ implementation for 💫StarCoder ☆457 Updated 2 years ago
- High-speed, easy-to-use LLM serving framework for local deployment ☆139 Updated 4 months ago
- Automatically quantize GGUF models ☆218 Updated this week
- On-device LLM Inference Powered by X-Bit Quantization ☆273 Updated last week
- Train your own small BitNet model ☆76 Updated last year
- LLM-based code completion engine ☆190 Updated 11 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆248 Updated last year
- Extension for using an alternative GitHub Copilot (StarCoder API) in VSCode ☆100 Updated last year
- ☆574 Updated last year
- Locally run an instruction-tuned, chat-style LLM (Android/Linux/Windows/Mac) ☆263 Updated 2 years ago
- AMD-related optimizations for transformer models ☆96 Updated 2 months ago
- INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model ☆1,554 Updated 9 months ago
- VSCode AI coding assistant powered by a self-hosted llama.cpp endpoint ☆183 Updated 10 months ago
- llama.cpp fork used by GPT4All ☆55 Updated 10 months ago
- CPU inference code for LLaMA models ☆137 Updated 2 years ago
- MiniCPM on iOS. ☆67 Updated 9 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆351 Updated last year
- 💬 Chatbot web app + HTTP and WebSocket endpoints for LLM inference with the Petals client ☆316 Updated last year
- ☆164 Updated 4 months ago
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) a… ☆350 Updated last week
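Several of the repositories above work with GGUF model files (the on-device format used by llama.cpp and related tools). As a minimal illustration of how such tools recognize a model file: a GGUF file begins with the 4-byte ASCII magic `GGUF`, followed by a little-endian 32-bit format version. The sketch below, with illustrative helper names, checks those header bytes; it is not taken from any of the listed projects.

```python
import struct


def read_gguf_header(path):
    """Return (magic, version) from the first 8 bytes of a file.

    GGUF files start with the ASCII magic b"GGUF" followed by a
    little-endian uint32 format version.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        version = struct.unpack("<I", f.read(4))[0]
    return magic, version


def is_gguf(path):
    """True if the file at `path` carries the GGUF magic bytes."""
    try:
        magic, _ = read_gguf_header(path)
    except (OSError, struct.error):
        # Missing file or fewer than 8 header bytes: not a GGUF file.
        return False
    return magic == b"GGUF"
```

A check like this only validates the header; real loaders go on to parse the metadata key-value section and tensor table that follow it.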