mlc-ai / binary-mlc-llm-libs
☆248 · Updated last week
Alternatives and similar repositories for binary-mlc-llm-libs
Users interested in binary-mlc-llm-libs are comparing it to the libraries listed below.
- llama.cpp tutorial on an Android phone ☆112 · Updated 2 months ago
- A mobile implementation of llama.cpp ☆312 · Updated last year
- MiniCPM on the Android platform ☆635 · Updated 3 months ago
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally ☆223 · Updated 5 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI ☆127 · Updated 2 years ago
- Automatically quantize GGUF models ☆185 · Updated last week
- ☆59 · Updated last year
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: a small language model tailored for edge devices ☆650 · Updated 2 months ago
- Stable Diffusion inference on an Android phone's CPU ☆155 · Updated last year
- C++ implementation for 💫StarCoder ☆455 · Updated last year
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆656 · Updated this week
- Python bindings for ggml ☆142 · Updated 10 months ago
- High-speed and easy-to-use LLM serving framework for local deployment ☆112 · Updated 3 months ago
- A mobile implementation of llama.cpp ☆25 · Updated last year
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA's TensorRT-LLM for GPU a… ☆43 · Updated 9 months ago
- Train your own small BitNet model ☆74 · Updated 8 months ago
- MiniCPM on iOS ☆68 · Updated 3 months ago
- LLM-based code completion engine ☆193 · Updated 5 months ago
- Visual Studio Code extension for WizardCoder ☆149 · Updated last year
- Port of Facebook's LLaMA model in C/C++ ☆97 · Updated last week
- LLM inference in C/C++ ☆78 · Updated 3 weeks ago
- Extension for using an alternative to GitHub Copilot (StarCoder API) in VSCode ☆100 · Updated last year
- AI for all: build the large graph of the language models ☆270 · Updated last year
- llama.cpp fork used by GPT4All ☆56 · Updated 4 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆349 · Updated 10 months ago
- Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp ☆154 · Updated 2 months ago
- ☆547 · Updated 8 months ago
- Efficient inference of Transformer models ☆439 · Updated 11 months ago
- AMD-related optimizations for transformer models ☆80 · Updated 3 weeks ago