tc-mb / llama.cpp

Port of Facebook's LLaMA model in C/C++

☆92

Alternatives and similar repositories for llama.cpp:

Users who are interested in llama.cpp are comparing it to the repositories listed below.

1694439208 / GOT-OCR-Inference
Research on accelerating real-world deployment of the GOT-OCR project, in any programming language
☆59 Updated 5 months ago
THUDM / GLM-Edge
GLM Series Edge Models
☆131 Updated last month
OpenBMB / MobileCPM
A Toolkit for Running On-device Large Language Models (LLMs) in APP
☆67 Updated 8 months ago
HimariO / llama.cpp.qwen2vl
Port of Facebook's LLaMA model in C/C++
☆33 Updated 2 weeks ago
infinigence / Infini-Megrez-Omni
☆218 Updated last month
yvonwin / qwen2.cpp
qwen2 and llama3 cpp implementation
☆43 Updated 9 months ago
tc-mb / ollama
Get up and running with Llama 3, Mistral, Gemma, and other large language models.
☆26 Updated this week
DataXujing / Qwen1.5-0.5b-chat-android
Deploying a large language model on Android phones with MNN-llm: Qwen1.5-0.5B-Chat
☆70 Updated 11 months ago
infinigence / Infini-Megrez
☆310 Updated 3 months ago
Oneflow-Inc / diffusers
☆27 Updated last year
SUSTech-IDEA / SUS-Chat
SUS-Chat: Instruction tuning done right
☆48 Updated last year
microsoft / WizardLM2
☆59 Updated 11 months ago
sophgo / ChatGLM3-TPU
Run ChatGLM3-6B on BM1684X
☆38 Updated last year
sophgo / ChatGLM2-TPU
Run ChatGLM2-6B on BM1684X
☆49 Updated last year
fyabc / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆20 Updated this week
Tlntin / qwen-ascend-llm
☆39 Updated 5 months ago
leafspark / AutoGGUF
Automatically quantize GGUF models
☆164 Updated last week
OpenBMB / MiniCPM-CookBook
This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…
☆223 Updated 5 months ago
infinigence / InfiniWebSearch
A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.
☆37 Updated 3 months ago
zkh2016 / LLMFarm
MiniCPM on iOS.
☆67 Updated last week
lucasjinreal / Namo-R1
A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.
☆171 Updated 3 weeks ago
thunlp / LLMxMapReduce
☆182 Updated last month
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆240 Updated 3 weeks ago
OpenJarvisAI / TianMu
TianMu: A modern AI tool with multi-platform support, markdown support, multimodal, continuous conversation, and customizable commands. 一…
☆83 Updated last year
DakeQQ / Native-LLM-for-Android
Demonstration of running a native LLM on Android device.
☆126 Updated this week
shootime2021 / APUS-xDAN-4.0-moe
It's an open-source LLM based on an MoE structure.
☆58 Updated 8 months ago
thunlp / Delta-CoMe
Delta-CoMe can achieve near-lossless 1-bit compression; the work has been accepted by NeurIPS 2024.
☆54 Updated 4 months ago
mulanai / MuLan
MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)
☆133 Updated 2 months ago
TencentARC-QQ / QA-CLIP
Chinese CLIP models with SOTA performance.
☆54 Updated last year
gpustack / llama-box
LM inference server implementation based on *.cpp.
☆154 Updated this week