ggerganov / llama.cpp
LLM inference in C/C++
☆69,013 Updated this week
Alternatives and similar repositories for llama.cpp:
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- Python bindings for llama.cpp ☆8,264 Updated this week
- Tensor library for machine learning ☆11,337 Updated this week
- A Gradio web UI for Large Language Models. ☆41,032 Updated this week
- Inference code for Llama models ☆56,731 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆31,618 Updated this week
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆37,203 Updated this week
- Port of OpenAI's Whisper model in C/C++ ☆36,103 Updated this week
- Instruct-tune LLaMA on consumer hardware ☆18,696 Updated 4 months ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆29,628 Updated 4 months ago
- GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. ☆71,070 Updated this week
- The simplest way to run LLaMA on your local machine ☆13,098 Updated 5 months ago
- Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models. ☆101,577 Updated this week
- LlamaIndex is a data framework for your LLM applications ☆37,200 Updated this week
- Universal LLM Deployment Engine with ML Compilation ☆19,346 Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆20,634 Updated 4 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,096 Updated 6 months ago
- Inference Llama 2 in one file of pure C ☆17,542 Updated 4 months ago
- Inference code for CodeLlama models ☆16,086 Updated 4 months ago
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… ☆37,109 Updated 3 months ago
- Large Language Model Text Generation Inference ☆9,206 Updated this week
- Locally run an Instruction-Tuned Chat-Style LLM ☆10,251 Updated last year
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath ☆9,287 Updated 4 months ago
- Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory ☆19,035 Updated this week
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆7,993 Updated 7 months ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,276 Updated 3 months ago
- Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom dataset… ☆15,516 Updated this week
- The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on… ☆26,888 Updated this week
- Official inference library for Mistral models ☆9,784 Updated last month
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆37,791 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆10,892 Updated this week