mlc-ai / binary-mlc-llm-libs
☆242Updated last month
Alternatives and similar repositories for binary-mlc-llm-libs
Users who are interested in binary-mlc-llm-libs are comparing it to the libraries listed below
- A mobile Implementation of llama.cpp☆311Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆123Updated last year
- MiniCPM on Android platform.☆631Updated 2 months ago
- An innovative library for efficient LLM inference via low-bit quantization☆348Updated 9 months ago
- llama.cpp tutorial on Android phone☆102Updated last month
- ☆119Updated last year
- ggml implementation of BERT☆492Updated last year
- On-device LLM Inference Powered by X-Bit Quantization☆245Updated this week
- Local ML voice chat using high-end models.☆167Updated last week
- Based on mlc-llm; a personal attempt to deploy and run large models on Android phones☆86Updated 10 months ago
- IRIS is an Android app for interfacing with GGUF / llama.cpp models locally.☆207Updated 4 months ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,523Updated 2 months ago
- Python bindings for ggml☆141Updated 9 months ago
- C++ implementation for 💫StarCoder☆452Updated last year
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)☆614Updated last week
- Web UI for ExLlamaV2☆495Updated 4 months ago
- A mobile Implementation of llama.cpp☆25Updated last year
- ☆157Updated this week
- ☆157Updated 10 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆130Updated 11 months ago
- GPTQ inference Triton kernel☆300Updated 2 years ago
- Stable Diffusion inference on the CPU of an Android phone☆152Updated last year
- ☆536Updated 7 months ago
- ☆59Updated last year
- Falcon LLM ggml framework with CPU and GPU support☆245Updated last year
- This reference can be used with any existing OpenAI integrated apps to run with TRT-LLM inference locally on GeForce GPU on Windows inste…☆121Updated last year
- Demonstration of running a native LLM on Android device.☆142Updated last week
- LLM-based code completion engine☆188Updated 4 months ago
- A multimodal, function calling powered LLM webui.☆214Updated 8 months ago
- Automatically quantize GGUF models☆181Updated this week