LLM inference in C/C++
☆109,291 · May 9, 2026 · Updated this week
Alternatives and similar repositories for llama.cpp
Users that are interested in llama.cpp are comparing it to the libraries listed below.
- Port of OpenAI's Whisper model in C/C++ ☆49,414 · May 2, 2026 · Updated last week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆79,733 · Updated this week
- Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models. ☆170,820 · May 6, 2026 · Updated last week
- Tensor library for machine learning ☆14,594 · May 5, 2026 · Updated last week
- Inference code for Llama models ☆59,404 · Jan 26, 2025 · Updated last year
- Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. ☆46,978 · Updated this week
- Python bindings for llama.cpp ☆10,285 · Updated this week
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. ☆77,364 · May 27, 2025 · Updated 11 months ago
- The agent engineering platform. Available in TypeScript! ☆136,191 · Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆39,471 · May 1, 2026 · Updated last week
- LlamaIndex is the leading document agent and OCR platform ☆49,354 · Updated this week
- Inference Llama 2 in one file of pure C ☆19,480 · Aug 6, 2024 · Updated last year
- Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. ☆63,952 · Updated this week
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆30,258 · Jul 17, 2024 · Updated last year
- Universal LLM Deployment Engine with ML Compilation ☆22,598 · Apr 22, 2026 · Updated 3 weeks ago
- User-friendly AI Interface (Supports Ollama, OpenAI API, ...) ☆136,384 · Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆160,288 · May 6, 2026 · Updated last week
- Distribute and run LLMs with a single file. ☆24,410 · May 4, 2026 · Updated last week
- AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus o… ☆184,200 · Updated this week
- Robust Speech Recognition via Large-Scale Weak Supervision ☆99,039 · Apr 15, 2026 · Updated 3 weeks ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆42,281 · Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆57,846 · Nov 12, 2025 · Updated 6 months ago
- Locally run an Instruction-Tuned Chat-Style LLM ☆10,153 · Apr 19, 2023 · Updated 3 years ago
- Instruct-tune LLaMA on consumer hardware ☆18,937 · Jul 29, 2024 · Updated last year
- LLM training in simple, raw C/CUDA ☆29,842 · Jun 26, 2025 · Updated 10 months ago
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆70,969 · May 3, 2026 · Updated last week
- Stable Diffusion web UI ☆162,812 · Mar 2, 2026 · Updated 2 months ago
- The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface. ☆112,559 · Updated this week
- SGLang is a high-performance serving framework for large language models and multimodal models. ☆27,516 · Updated this week
- MLX: An array framework for Apple silicon ☆26,132 · Updated this week
- LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required. ☆46,205 · Updated this week
- Fast and memory-efficient exact attention ☆23,736 · Updated this week
- Making large AI models cheaper, faster and more accessible ☆41,380 · Updated this week
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆37,409 · Aug 17, 2024 · Updated last year
- A programming framework for agentic AI ☆57,766 · Apr 15, 2026 · Updated 3 weeks ago
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆45,804 · May 6, 2026 · Updated last week
- Interact with your documents using the power of GPT, 100% privately, no data leaks ☆57,209 · Feb 26, 2026 · Updated 2 months ago
- 🔊 Text-Prompted Generative Audio Model ☆39,105 · Aug 19, 2024 · Updated last year
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆42,505 · May 5, 2026 · Updated last week