JohannesGaessler / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆12 Updated this week
Alternatives and similar repositories for llama.cpp
Users who are interested in llama.cpp are comparing it to the libraries listed below.
- Web UI for ExLlamaV2 ☆513 Updated last year
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆1,121 Updated 2 weeks ago
- ☆674 Updated 3 weeks ago
- A gradio web UI for running Large Language Models like GPT-J 6B, OPT, GALACTICA, LLaMA, and Pygmalion. ☆310 Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,908 Updated 2 years ago
- A multimodal, function calling powered LLM webui. ☆216 Updated last year
- An AI assistant beyond the chat box. ☆329 Updated last year
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆615 Updated 11 months ago
- A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI… ☆597 Updated 2 years ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model ☆1,562 Updated 10 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆249 Updated 2 years ago
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆626 Updated last week
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,879 Updated 2 years ago
- ☆535 Updated 2 years ago
- Large-scale LLM inference engine ☆1,647 Updated 2 weeks ago
- C++ implementation for 💫StarCoder ☆459 Updated 2 years ago
- An autonomous AI agent extension for Oobabooga's web ui ☆173 Updated 2 years ago
- Self-evaluating interview for AI coders ☆600 Updated 7 months ago
- A fast batching API to serve LLM models ☆189 Updated last year
- Simple go utility to download HuggingFace Models and Datasets ☆874 Updated last week
- LLM Frontend in a single html file ☆694 Updated last month
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,440 Updated 2 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 Updated last year
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆854 Updated last year
- Simplified installers for oobabooga/text-generation-webui. ☆565 Updated 2 years ago
- Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot. ☆569 Updated last year
- KoboldAI is generative AI software optimized for fictional use, but capable of much more! ☆422 Updated last year
- Customizable implementation of the self-instruct paper. ☆1,049 Updated last year
- Tune any FALCON in 4-bit ☆463 Updated 2 years ago
- A proof-of-concept project that showcases the potential for using small, locally trainable LLMs to create next-generation documentation t… ☆541 Updated 2 years ago