Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It lets users chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output or function calling.
☆589 Updated 6 months ago
Alternatives and similar repositories for llama-cpp-agent
Users who are interested in llama-cpp-agent are comparing it to the libraries listed below.
- Function-calling-based LLM agents ☆287 Updated 11 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆608 Updated 10 months ago
- A fast batching API to serve LLM models ☆187 Updated last year
- A multimodal, function-calling-powered LLM web UI. ☆216 Updated 11 months ago
- ☆1,071 Updated last year
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language ☆313 Updated last year
- Web UI for ExLlamaV2 ☆513 Updated 7 months ago
- An AI assistant beyond the chat box. ☆328 Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆365 Updated last week
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆185 Updated last year
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ☆405 Updated 4 months ago
- Efficient visual programming for AI language models ☆362 Updated 3 months ago
- Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible ☆337 Updated 6 months ago
- Software to implement GoT with a Weaviate vectorized database ☆676 Updated 5 months ago
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆278 Updated 2 months ago
- ☆209 Updated this week
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆164 Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 Updated last year
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 Updated last year
- Large-scale LLM inference engine ☆1,543 Updated this week
- Task-based Agentic Framework using StrictJSON as the core ☆456 Updated 3 weeks ago
- A Python package for developing AI applications with local LLMs. ☆151 Updated 8 months ago
- A tool for generating function arguments and choosing what function to call with local LLMs ☆428 Updated last year
- An OpenAI-compatible API for chat with image input and questions about the images (i.e., multimodal chat). ☆259 Updated 6 months ago
- The easiest and fastest way to run AI-generated Python code safely ☆328 Updated 9 months ago
- FastMLX is a high-performance, production-ready API to host MLX models. ☆326 Updated 5 months ago
- ☆161 Updated last month
- This project demonstrates a basic chain-of-thought interaction with any LLM (Large Language Model) ☆322 Updated 11 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆247 Updated last year
- TheBloke's Dockerfiles ☆307 Updated last year