mlc-ai / binary-mlc-llm-libs
☆266Updated 2 weeks ago
Alternatives and similar repositories for binary-mlc-llm-libs
Users who are interested in binary-mlc-llm-libs are comparing it to the libraries listed below
- A mobile implementation of llama.cpp☆322Updated last year
- llama.cpp tutorial on Android phone☆137Updated 7 months ago
- IRIS is an android app for interfacing with GGUF / llama.cpp models locally.☆252Updated 10 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆130Updated 2 years ago
- A mobile implementation of llama.cpp☆26Updated 2 years ago
- MiniCPM on Android platform.☆634Updated 8 months ago
- Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp☆163Updated 7 months ago
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices☆667Updated 6 months ago
- High-speed and easy-use LLM serving framework for local deployment☆137Updated 4 months ago
- Stable Diffusion inference on the CPU of an Android phone☆159Updated 2 years ago
- ☆65Updated last year
- Automatically quantize GGUF models☆219Updated last month
- An Ollama client for Android!☆88Updated last year
- Falcon LLM ggml framework with CPU and GPU support☆248Updated last year
- llama.cpp fork used by GPT4All☆55Updated 9 months ago
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a…☆42Updated last year
- On-device LLM Inference Powered by X-Bit Quantization☆272Updated 2 weeks ago
- LLM-based code completion engine☆190Updated 10 months ago
- Running any GGUF SLMs/LLMs locally, on-device in Android☆588Updated 3 weeks ago
- 1.58-bit LLaMa model☆83Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub☆161Updated 2 years ago
- An innovative library for efficient LLM inference via low-bit quantization☆350Updated last year
- Train your own small bitnet model☆75Updated last year
- C++ implementation for 💫StarCoder☆457Updated 2 years ago
- Making offline AI models accessible to all types of edge devices.☆144Updated last year
- An unsupervised model merging algorithm for Transformers-based language models.☆108Updated last year
- Extension for using alternative GitHub Copilot (StarCoder API) in VSCode☆100Updated last year
- Visual Studio Code extension for WizardCoder☆149Updated 2 years ago
- Evaluating and unaligning Chinese LLM censorship☆70Updated 7 months ago
- AI for all: Build the large graph of the language models☆277Updated last year