JohannesGaessler / llama.cpp

Port of Facebook's LLaMA model in C/C++

☆11

Alternatives and similar repositories for llama.cpp:

Users who are interested in llama.cpp are comparing it to the repositories listed below.

cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆246 Updated last year
bigcode-project / starcoder.cpp
C++ implementation for 💫StarCoder
☆450 Updated last year
wawawario2 / long_term_memory
A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion.
☆308 Updated last year
itsme2417 / PolyMind
A multimodal, function-calling-powered LLM web UI.
☆213 Updated 4 months ago
leafspark / AutoGGUF
Automatically quantize GGUF models
☆151 Updated this week
epolewski / EricLLM
A fast batching API to serve LLM models
☆180 Updated 9 months ago
flurb18 / AgentOoba
An autonomous AI agent extension for Oobabooga's web ui
☆176 Updated last year
AndrewVeee / nucleo-ai
An AI assistant beyond the chat box.
☆317 Updated 10 months ago
abetlen / ggml-python
Python bindings for ggml
☆136 Updated 4 months ago
mamei16 / LLM_Web_search
An extension for oobabooga/text-generation-webui that enables the LLM to search the web using DuckDuckGo
☆201 Updated this week
Nuggt-dev / Nuggt
An autonomous LLM agent that runs on WizardCoder-15B
☆338 Updated 3 months ago
turboderp-org / exui
Web UI for ExLlamaV2
☆470 Updated this week
TheBlokeAI / dockerLLM
TheBloke's Dockerfiles
☆301 Updated 10 months ago
eugenepentland / landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
☆123 Updated last year
skeskinen / bert.cpp
ggml implementation of BERT
☆476 Updated 11 months ago
teknium1 / alpaca-roleplay-discordbot
A discord bot that roleplays!
☆147 Updated last year
Potatooff / Le-Potato
Simple, elegant LLM chat inference
☆24 Updated 7 months ago
monatis / clip.cpp
CLIP inference in plain C/C++ with no extra dependencies
☆475 Updated 5 months ago
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆408 Updated last year
Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM …
☆522 Updated last month
kalomaze / koboldcpp
My personal fork of koboldcpp where I hack in experimental samplers.
☆43 Updated 8 months ago
galatolofederico / microchain
Function-calling-based LLM agents
☆283 Updated 4 months ago
GiusTex / EdgeGPT
Extension for Text Generation Webui based on EdgeGPT, a reverse engineered API of Microsoft's Bing Chat AI
☆124 Updated last year
runpod-workers / worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
☆273 Updated this week
aigoopy / llm-jeopardy
Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts
☆111 Updated last year
RWKV / rwkv.cpp
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
☆1,454 Updated last week
ortegaalfredo / neurochat
Native GUI for several AI services plus llama.cpp local AIs.
☆108 Updated last year
Maximilian-Winter / guidance
A guidance language for controlling large language models.
☆44 Updated last year
abacaj / replit-3B-inference
Run inference on replit-3B code instruct model using CPU
☆154 Updated last year
latent-variable / Real_time_fallacy_detection
Real-time fallacy detection using OpenAI Whisper and ChatGPT/LLaMA/Mistral
☆110 Updated last year