promptfoo / promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
☆5,832 · Updated this week
Alternatives and similar repositories for promptfoo:
Users who are interested in promptfoo are comparing it to the libraries listed below:
- structured outputs for llms ☆9,808 · Updated this week
- Adding guardrails to large language models. ☆4,646 · Updated this week
- Structured Text Generation ☆11,072 · Updated this week
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLM's perceive of key information, compress the prompt and KV-Cache, which ach… ☆4,939 · Updated last week
- The LLM Evaluation Framework ☆5,585 · Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆18,941 · Updated this week
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ☆10,513 · Updated this week
- Supercharge Your LLM Application Evaluations 🚀 ☆8,457 · Updated last week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆9,390 · Updated this week
- DSPy: The framework for programming—not prompting—language models ☆22,470 · Updated this week
- NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. ☆4,513 · Updated last week
- Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chro… ☆2,810 · Updated 7 months ago
- A Bulletproof Way to Generate Structured JSON from Language Models ☆4,643 · Updated last year
- An open-source visual programming environment for battle-testing prompts to LLMs. ☆2,530 · Updated this week
- AI Observability & Evaluation ☆5,075 · Updated this week
- Build Conversational AI in minutes ⚡️ ☆8,883 · Updated this week
- All things prompt engineering ☆5,569 · Updated 9 months ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ☆3,705 · Updated 7 months ago
- The most advanced AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ☆5,507 · Updated this week
- An awesome & curated list of best LLMOps tools for developers ☆4,586 · Updated last month
- Retrieval Augmented Generation (RAG) chatbot powered by Weaviate ☆6,937 · Updated 2 weeks ago
- 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓 ☆3,413 · Updated this week
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud. ☆10,964 · Updated last week
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. ☆7,456 · Updated 6 months ago
- A language for constraint-guided and efficient LLM programming. ☆3,864 · Updated 9 months ago
- Harness LLMs with Multi-Agent Programming ☆3,158 · Updated this week
- A guidance language for controlling large language models. ☆19,883 · Updated this week
- A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. ☆7,310 · Updated last week
- ✨ AI agents that spark joy ☆5,558 · Updated last week
- Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls) ☆11,891 · Updated this week