abetlen / llama-cpp-python

Python bindings for llama.cpp

☆9,658

Alternatives and similar repositories for llama-cpp-python

Users who are interested in llama-cpp-python compare it to the libraries listed below.

ggml-org / ggml
Tensor library for machine learning
☆13,261 Updated last week
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,566 Updated last month
turboderp-org / exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
☆4,341 Updated 2 months ago
ggml-org / llama.cpp
LLM inference in C/C++
☆87,889 Updated this week
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆4,965 Updated 6 months ago
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,697 Updated last year
nlpxucan / WizardLM
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
☆9,453 Updated 4 months ago
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,903 Updated 2 years ago
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆10,634 Updated this week
marella / ctransformers
Python bindings for Transformer models implemented in C/C++ using the GGML library.
☆1,876 Updated last year
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆6,378 Updated last month
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆7,647 Updated 2 weeks ago
oobabooga / text-generation-webui
The definitive Web UI for local AI, with powerful features and easy setup.
☆45,205 Updated this week
openlm-research / open_llama
OpenLLaMA, a permissively licensed open-source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,524 Updated 2 years ago
mlc-ai / mlc-llm
Universal LLM Deployment Engine with ML Compilation
☆21,471 Updated this week
jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆8,768 Updated last year
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,965 Updated last year
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,077 Updated 3 months ago
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,829 Updated 10 months ago
bigscience-workshop / petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
☆9,808 Updated last year
bentoml / OpenLLM
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
☆11,840 Updated last week
lm-sys / FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
☆39,161 Updated 4 months ago
antimatter15 / alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
☆10,205 Updated 2 years ago
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,367 Updated 11 months ago
huggingface / trl
Train transformer language models with reinforcement learning.
☆15,934 Updated this week
qwopqwop200 / GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
☆3,073 Updated last year
imoneoi / openchat
OpenChat: Advancing Open-source Language Models with Imperfect Data
☆5,435 Updated last year
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,181 Updated last year
dottxt-ai / outlines
Structured Outputs
☆12,712 Updated this week
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,070 Updated last year