mlc-ai / binary-mlc-llm-libs
☆252Updated last month
Alternatives and similar repositories for binary-mlc-llm-libs
Users that are interested in binary-mlc-llm-libs are comparing it to the libraries listed below
- llama.cpp tutorial on Android phone☆124Updated 3 months ago
- A mobile Implementation of llama.cpp☆315Updated last year
- IRIS is an android app for interfacing with GGUF / llama.cpp models locally.☆233Updated 6 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆129Updated 2 years ago
- Automatically quantize GGUF models☆196Updated this week
- Falcon LLM ggml framework with CPU and GPU support☆247Updated last year
- A mobile Implementation of llama.cpp☆25Updated last year
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)☆677Updated this week
- AMD related optimizations for transformer models☆82Updated this week
- C++ implementation for 💫StarCoder☆457Updated last year
- Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp☆158Updated 3 months ago
- Stable Diffusion inference on an Android phone's CPU☆158Updated last year
- MiniCPM on Android platform.☆634Updated 5 months ago
- A multimodal, function calling powered LLM webui.☆216Updated 11 months ago
- WebAssembly (Wasm) Build and Bindings for llama.cpp☆278Updated last year
- An extension for oobabooga/text-generation-webui that enables the LLM to search the web☆263Updated 2 weeks ago
- Locally run an Instruction-Tuned Chat-Style LLM (Android/Linux/Windows/Mac)☆265Updated 2 years ago
- High-speed and easy-to-use LLM serving framework for local deployment☆116Updated 3 weeks ago
- ☆161Updated 2 weeks ago
- StarCoder server for huggingface-vscode custom endpoint☆173Updated last year
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices☆655Updated 3 months ago
- Visual Studio Code extension for WizardCoder☆149Updated 2 years ago
- An unsupervised model merging algorithm for Transformers-based language models.☆106Updated last year
- ggml implementation of BERT☆492Updated last year
- ☆306Updated this week
- Train your own small bitnet model☆75Updated 10 months ago
- Train LLaMA with LoRA on a single 4090 and merge the LoRA weights to work as Stanford Alpaca.☆52Updated 2 years ago
- Python bindings for ggml☆146Updated 11 months ago
- CPU inference code for LLaMA models☆137Updated 2 years ago
- 1.58-bit LLaMA model☆82Updated last year