Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.
☆566 Updated 3 months ago
Alternatives and similar repositories for llama-cpp-agent
Users who are interested in llama-cpp-agent are comparing it to the libraries listed below.
- function calling-based LLM agents ☆285 Updated 8 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆592 Updated 7 months ago
- ☆893 Updated 8 months ago
- A fast batching API to serve LLM models ☆181 Updated last year
- A multimodal, function calling powered LLM webui. ☆214 Updated 8 months ago
- An AI assistant beyond the chat box. ☆329 Updated last year
- Web UI for ExLlamaV2 ☆495 Updated 3 months ago
- Efficient visual programming for AI language models ☆361 Updated 2 weeks ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆238 Updated last year
- A python package for developing AI applications with local LLMs. ☆150 Updated 4 months ago
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆267 Updated last week
- FastMLX is a high performance production ready API to host MLX models. ☆305 Updated 2 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆221 Updated last year
- Large-scale LLM inference engine ☆1,435 Updated this week
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language ☆309 Updated 11 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆171 Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆153 Updated last year
- ☆202 Updated 2 weeks ago
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆254 Updated 2 months ago
- ☆157 Updated 10 months ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆614 Updated this week
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆378 Updated 3 weeks ago
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆315 Updated 3 months ago
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆317 Updated 2 weeks ago
- A simple Python sandbox for helpful LLM data agents ☆264 Updated 11 months ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆185 Updated 10 months ago
- Open-source Perplexity app. ☆126 Updated 2 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆764 Updated this week
- A library for easily merging multiple LLM experts, and efficiently train the merged LLM. ☆477 Updated 9 months ago
- The official API server for Exllama. OAI compatible, lightweight, and fast. ☆962 Updated last week