Maximilian-Winter / llama-cpp-agent
The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). It allows users to chat with LLMs, execute structured function calls, and get structured output. It also works with models that are not fine-tuned for JSON output and function calls.
☆599Updated 8 months ago
Alternatives and similar repositories for llama-cpp-agent
Users who are interested in llama-cpp-agent are comparing it to the libraries listed below.
- Function-calling-based LLM agents☆289Updated last year
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.☆617Updated 11 months ago
- A fast batching API to serve LLM models☆188Updated last year
- A multimodal, function calling powered LLM webui.☆216Updated last year
- ☆1,097Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM.☆371Updated last month
- Web UI for ExLlamaV2☆510Updated 8 months ago
- An AI assistant beyond the chat box.☆327Updated last year
- Large-scale LLM inference engine☆1,567Updated last week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI☆221Updated last year
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆187Updated last year
- A tool for generating function arguments and choosing what function to call with local LLMs☆430Updated last year
- Querying local documents, powered by LLM☆628Updated 3 months ago
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data.☆411Updated 5 months ago
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language☆316Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction'☆239Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2.☆165Updated last year
- Software to implement GoT with a Weaviate vectorized database☆677Updated 6 months ago
- A python package for developing AI applications with local LLMs.☆151Updated 9 months ago
- Task-based Agentic Framework using StrictJSON as the core☆458Updated 3 weeks ago
- Falcon LLM ggml framework with CPU and GPU support☆247Updated last year
- C++ implementation for 💫StarCoder☆455Updated 2 years ago
- Customizable implementation of the self-instruct paper.☆1,050Updated last year
- The easiest and fastest way to run AI-generated Python code safely☆335Updated 10 months ago
- Efficient visual programming for AI language models☆361Updated 5 months ago
- ☆207Updated last year
- One click templates for inferencing Language Models☆213Updated 2 months ago
- ☆206Updated last month
- ☆162Updated 2 months ago
- FastMLX is a high performance production ready API to host MLX models.☆331Updated 7 months ago