mlc-ai / binary-mlc-llm-libsLinks

☆266

Alternatives and similar repositories for binary-mlc-llm-libs

Users that are interested in binary-mlc-llm-libs are comparing it to the libraries listed below

Sorting:

Bip-Rep / sherpa
A mobile Implementation of llama.cpp
☆322Updated last year
JackZeng0208 / llama.cpp-android-tutorial
llama.cpp tutorial on Android phone
☆137Updated 7 months ago
nerve-sparks / iris_android
IRIS is an android app for interfacing with GGUF / llama.cpp models locally.
☆252Updated 10 months ago
nuance1979 / llama-server
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
☆130Updated 2 years ago
dsd / sherpa
A mobile Implementation of llama.cpp
☆26Updated 2 years ago
Achazwl / mlc
MiniCPM on Android platform.
☆634Updated 8 months ago
akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆163Updated 7 months ago
mbzuai-oryx / MobiLlama
[ICLR-2025-SLLM Spotlight 🔥]MobiLlama : Small Language Model tailored for edge devices
☆667Updated 6 months ago
powerserve-project / PowerServe
High-speed and easy-use LLM serving framework for local deployment
☆137Updated 4 months ago
ZTMIDGO / Android-Stable-diffusion-ONNX
使用Android手机的CPU推理stable diffusion
☆159Updated 2 years ago
UbiquitousLearning / PhoneLM
☆65Updated last year
leafspark / AutoGGUF
automatically quant GGUF models
☆219Updated last month
Ac1drainn / OllamaDroid
A Ollama client for Android!
☆88Updated last year
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆248Updated last year
nomic-ai / llama.cpp
llama.cpp fork used by GPT4All
☆55Updated 9 months ago
janhq / cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a…
☆42Updated last year
Picovoice / picollm
On-device LLM Inference Powered by X-Bit Quantization
☆272Updated 2 weeks ago
ggml-org / p1
LLM-based code completion engine
☆190Updated 10 months ago
shubham0204 / SmolChat-Android
Running any GGUF SLMs/LLMs locally, on-device in Android
☆588Updated 3 weeks ago
rafacelente / bllama
1.58-bit LLaMa model
☆83Updated last year
TheBlokeAI / AIScripts
Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub
☆161Updated 2 years ago
intel / neural-speed
An innovative library for efficient LLM inference via low-bit quantization
☆350Updated last year
pranavjad / tinyllama-bitnet
Train your own small bitnet model
☆75Updated last year
bigcode-project / starcoder.cpp
C++ implementation for 💫StarCoder
☆457Updated 2 years ago
BrutalCoding / shady.ai
Making offline AI models accessible to all types of edge devices.
☆144Updated last year
Gryphe / MergeMonster
An unsupervised model merging algorithm for Transformers-based language models.
☆108Updated last year
Lisoveliy / StarCoderEx
Extension for using alternative GitHub Copilot (StarCoder API) in VSCode
☆100Updated last year
mzbac / wizardCoder-vsc
Visual Studio Code extension for WizardCoder
☆149Updated 2 years ago
AUGMXNT / deccp
Evaling and unaligning Chinese LLM censorship
☆70Updated 7 months ago
NexaAI / octopus-v4
AI for all: Build the large graph of the language models
☆277Updated last year