Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output, and it also works with models that are not fine-tuned for JSON output or function calling.
☆558 · Updated 2 months ago
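As a rough illustration of the chat workflow described above, the sketch below loads a local GGUF model through llama-cpp-python and wraps it in an agent. The class and parameter names (`LlamaCppAgent`, `LlamaCppPythonProvider`, `MessagesFormatterType`, `get_chat_response`) and the model path are assumptions drawn from the project's README and may differ between versions; consult the repository for the current API.

```python
# Minimal sketch: chat with a local GGUF model via llama-cpp-agent.
# NOTE: class/parameter names below are assumptions based on the project's
# README and may not match every release of llama-cpp-agent.
from llama_cpp import Llama

from llama_cpp_agent import LlamaCppAgent, MessagesFormatterType
from llama_cpp_agent.providers import LlamaCppPythonProvider

# Load any GGUF model with llama-cpp-python (the path is a placeholder).
llama = Llama(model_path="models/mistral-7b-instruct-v0.2.Q4_K_M.gguf", n_ctx=4096)

# Wrap the model in a provider and build the agent with a prompt format
# matching the model's chat template.
provider = LlamaCppPythonProvider(llama)
agent = LlamaCppAgent(
    provider,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.MISTRAL,
)

print(agent.get_chat_response("Write a haiku about local LLMs."))
```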
Alternatives and similar repositories for llama-cpp-agent:
Users interested in llama-cpp-agent are comparing it to the libraries listed below.
- Function-calling-based LLM agents ☆285 · Updated 7 months ago
- ☆868 · Updated 7 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆582 · Updated 6 months ago
- A multimodal, function-calling-powered LLM web UI. ☆214 · Updated 7 months ago
- A fast batching API to serve LLMs ☆182 · Updated last year
- Efficient visual programming for AI language models ☆359 · Updated 7 months ago
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆311 · Updated 2 months ago
- FastMLX is a high-performance, production-ready API to host MLX models. ☆297 · Updated last month
- An AI assistant beyond the chat box. ☆328 · Updated last year
- Web UI for ExLlamaV2 ☆493 · Updated 3 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆152 · Updated 11 months ago
- Large-scale LLM inference engine ☆1,405 · Updated last week
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆359 · Updated last week
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆309 · Updated this week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 · Updated last year
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆264 · Updated this week
- An OpenAI-API-compatible API for chat with image input and questions about the images, i.e. multimodal chat. ☆251 · Updated 2 months ago
- A Python package for developing AI applications with local LLMs. ☆149 · Updated 4 months ago
- Software to implement GoT with a Weaviate vector database ☆663 · Updated last month
- Convenience scripts to fine-tune (chat-)LLaMa3 and other models for any language ☆305 · Updated 10 months ago
- Automatically quantize GGUF models ☆174 · Updated this week
- ☆288 · Updated last month
- Open-source alternative to Perplexity AI with the ability to run locally ☆202 · Updated 6 months ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆181 · Updated 9 months ago
- An OpenAI-compatible exllamav2 API that's both lightweight and fast ☆940 · Updated this week
- ☆201 · Updated 2 weeks ago
- A library for easily merging multiple LLM experts and efficiently training the merged LLM. ☆472 · Updated 8 months ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ☆115 · Updated 11 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆237 · Updated 11 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆747 · Updated last week