tc-mb / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆92Updated this week
Alternatives and similar repositories for llama.cpp:
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- Research on accelerating production deployment of the GOT-OCR project, in any language☆59Updated 5 months ago
- GLM Series Edge Models☆131Updated last month
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆67Updated 8 months ago
- Port of Facebook's LLaMA model in C/C++☆33Updated 2 weeks ago
- ☆218Updated last month
- qwen2 and llama3 cpp implementation☆43Updated 9 months ago
- Get up and running with Llama 3, Mistral, Gemma, and other large language models.☆26Updated this week
- Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat☆70Updated 11 months ago
- ☆310Updated 3 months ago
- ☆27Updated last year
- SUS-Chat: Instruction tuning done right☆48Updated last year
- ☆59Updated 11 months ago
- run chatglm3-6b in BM1684X☆38Updated last year
- run ChatGLM2-6B in BM1684X☆49Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆20Updated this week
- ☆39Updated 5 months ago
- automatically quant GGUF models☆164Updated last week
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆223Updated 5 months ago
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆37Updated 3 months ago
- MiniCPM on iOS.☆67Updated last week
- A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.☆171Updated 3 weeks ago
- ☆182Updated last month
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆240Updated 3 weeks ago
- TianMu: A modern AI tool with multi-platform support, markdown support, multimodal, continuous conversation, and customizable commands. 一…☆83Updated last year
- Demonstration of running a native LLM on Android device.☆126Updated this week
- An open-source LLM based on an MoE structure.☆58Updated 8 months ago
- Delta-CoMe can achieve near-lossless 1-bit compression and has been accepted by NeurIPS 2024☆54Updated 4 months ago
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)☆133Updated 2 months ago
- Chinese CLIP models with SOTA performance.☆54Updated last year
- LM inference server implementation based on *.cpp.☆154Updated this week