mlc-ai / binary-mlc-llm-libs
☆291 Updated last week
Alternatives and similar repositories for binary-mlc-llm-libs
Users that are interested in binary-mlc-llm-libs are comparing it to the libraries listed below
- llama.cpp tutorial on Android phone ☆148 Updated 9 months ago
- A mobile Implementation of llama.cpp ☆326 Updated 2 years ago
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally. ☆266 Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆130 Updated 2 years ago
- A mobile Implementation of llama.cpp ☆26 Updated 2 years ago
- MiniCPM on Android platform. ☆636 Updated 10 months ago
- Stable Diffusion inference on an Android phone's CPU ☆158 Updated 2 years ago
- High-speed and easy-use LLM serving framework for local deployment ☆146 Updated 6 months ago
- On-device LLM Inference Powered by X-Bit Quantization ☆278 Updated 2 weeks ago
- Falcon LLM ggml framework with CPU and GPU support ☆249 Updated 2 years ago
- [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices ☆667 Updated 9 months ago
- AMD related optimizations for transformer models ☆97 Updated 3 months ago
- C++ implementation for 💫StarCoder ☆459 Updated 2 years ago
- ggml implementation of BERT ☆498 Updated last year
- Inference on CPU code for LLaMA models ☆137 Updated 2 years ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆798 Updated last week
- LLM-based code completion engine ☆190 Updated last year
- Python bindings for ggml ☆147 Updated last year
- An innovative library for efficient LLM inference via low-bit quantization ☆352 Updated last year
- automatically quant GGUF models ☆219 Updated last month
- llama.cpp fork used by GPT4All ☆55 Updated 11 months ago
- WebAssembly (Wasm) Build and Bindings for llama.cpp ☆285 Updated last year
- Train your own small bitnet model ☆77 Updated last year
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 Updated last year
- Running any GGUF SLMs/LLMs locally, on-device in Android ☆666 Updated last month
- The Qualcomm® AI Hub apps are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) a… ☆369 Updated 2 weeks ago
- Visual Studio Code extension for WizardCoder ☆149 Updated 2 years ago
- Extension for using alternative GitHub Copilot (StarCoder API) in VSCode ☆100 Updated last year
- MiniCPM on iOS. ☆67 Updated 10 months ago
- ☆577 Updated last year