mlc-ai / binary-mlc-llm-libs
☆249 · Updated last month
Alternatives and similar repositories for binary-mlc-llm-libs
Users interested in binary-mlc-llm-libs are comparing it to the libraries listed below.
- llama.cpp tutorial on an Android phone (☆120, updated 3 months ago)
- A mobile implementation of llama.cpp (☆314, updated last year)
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally. (☆225, updated 6 months ago)
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. (☆128, updated 2 years ago)
- Falcon LLM ggml framework with CPU and GPU support (☆246, updated last year)
- MiniCPM on the Android platform (☆636, updated 4 months ago)
- Run Stable Diffusion inference on an Android phone's CPU (☆156, updated last year)
- [ICLR 2025 SLLM Spotlight 🔥] MobiLlama: a small language model tailored for edge devices (☆653, updated 2 months ago)
- Automatically quantize GGUF models (☆188, updated last week)
- A mobile implementation of llama.cpp (☆25, updated last year)
- Locally run an instruction-tuned chat-style LLM (Android/Linux/Windows/Mac) (☆265, updated 2 years ago)
- A multimodal, function-calling-powered LLM web UI (☆215, updated 10 months ago)
- C++ implementation for 💫StarCoder (☆456, updated last year)
- Extension for using an alternative GitHub Copilot (StarCoder API) in VS Code (☆100, updated last year)
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp (☆157, updated 3 months ago)
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… (☆252, updated this week)
- On-device LLM inference powered by X-Bit Quantization (☆260, updated 2 weeks ago)
- ggml implementation of BERT (☆495, updated last year)
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) (☆668, updated this week)
- (☆157, updated last year)
- High-speed, easy-to-use LLM serving framework for local deployment (☆115, updated 4 months ago)
- MiniCPM on iOS (☆68, updated 4 months ago)
- (☆549, updated 9 months ago)
- Making offline AI models accessible to all types of edge devices (☆142, updated last year)
- WebAssembly (Wasm) build and bindings for llama.cpp (☆273, updated last year)
- This reference can be used with any existing OpenAI-integrated apps to run with TRT-LLM inference locally on a GeForce GPU on Windows inste… (☆126, updated last year)
- Train your own small BitNet model (☆75, updated 9 months ago)
- Download models from the Ollama library, without Ollama (☆90, updated 8 months ago)
- LLM-based code completion engine (☆193, updated 6 months ago)
- llama.cpp fork used by GPT4All (☆56, updated 5 months ago)