Python package wrapping llama.cpp for on-device LLM inference
☆103 · Apr 2, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for easy-llama
Users that are interested in easy-llama are comparing it to the libraries listed below.
- entropix style sampling + GUI ☆27 · Oct 30, 2024 · Updated last year
- Simple Summarizer Tool using Llama 3 8b. ☆10 · May 14, 2024 · Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 · Jan 5, 2026 · Updated 3 months ago
- An API for VoiceCraft. ☆25 · Jun 27, 2024 · Updated last year
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- ☆94 · Mar 28, 2026 · Updated 2 weeks ago
- ☆83 · Feb 28, 2025 · Updated last year
- ☆35 · May 9, 2024 · Updated last year
- Terminal Voice Assistant is a powerful and flexible tool designed to help users interact with their terminal using natural language comma… ☆19 · Jun 9, 2024 · Updated last year
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- A bot that checks your grammar and phrasing using LLM of choice ☆32 · Feb 6, 2025 · Updated last year
- Web UI for ExLlamaV2 ☆511 · Feb 5, 2025 · Updated last year
- run ollama & gguf easily with a single command ☆52 · May 15, 2024 · Updated last year
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆45 · Mar 21, 2024 · Updated 2 years ago
- Llama.cui is a small llama.cpp-based chat application for Node.js ☆20 · Jul 10, 2025 · Updated 9 months ago
- ☆12 · Jan 19, 2024 · Updated 2 years ago
- A simple library for working with Hugging Face models. ☆14 · Dec 30, 2024 · Updated last year
- Server plugin to extract text from Office documents using the officeparser library. ☆13 · Mar 14, 2026 · Updated last month
- Use Codestral Mamba with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot. ☆29 · Jul 18, 2024 · Updated last year
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes. ☆38 · Mar 12, 2024 · Updated 2 years ago
- A QT GUI for large language models ☆40 · Dec 27, 2023 · Updated 2 years ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM … ☆625 · Mar 9, 2026 · Updated last month
- ChatGPT CSS style ☆14 · Apr 28, 2024 · Updated last year
- A simple character editor for v2 Tavern Character Cards ☆62 · Jan 18, 2025 · Updated last year
- Web page with political compass quiz results for open LLMs ☆38 · Jan 31, 2024 · Updated 2 years ago
- Writing Extension for Text Generation WebUI ☆67 · Aug 7, 2025 · Updated 8 months ago
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible ☆15 · Dec 19, 2023 · Updated 2 years ago
- LLM backed Fantasy Tribe Game ☆19 · Nov 21, 2024 · Updated last year
- A Qt GUI for large language models ☆45 · Nov 17, 2023 · Updated 2 years ago
- Kosmos-2.5 is a cutting-edge Multimodal-LLM (MLLM) specializing in image OCR. However, its stringent software requirements & Python-scrip… ☆68 · Jul 22, 2024 · Updated last year
- An F/OSS solution combining AI with Wikipedia knowledge via a RAG pipeline ☆99 · Jan 12, 2025 · Updated last year
- DOD or data oriented design development, what is it and how to do it ☆34 · Updated this week
- A simple frontend page to interact with an OpenAI like API ☆16 · Jan 31, 2025 · Updated last year
- ☆43 · Aug 2, 2025 · Updated 8 months ago
- Give your local LLM a real memory with a lightweight, fully local memory system. 100% offline and under your control. ☆70 · Sep 16, 2025 · Updated 7 months ago
- A multimodal inference pipeline that integrates InstructBLIP with textgen-webui for Vicuna and related models. ☆33 · Jul 14, 2023 · Updated 2 years ago
- Groquments is a simple demonstration project showcasing how easily PocketGroq can help developers integrate Groq's powerful AI capabiliti… ☆12 · Sep 19, 2024 · Updated last year
- A portable linker for multiple file formats. ☆14 · Aug 28, 2023 · Updated 2 years ago