empirical-run / empirical
Test and evaluate LLMs and model configurations, across all the scenarios that matter for your application
☆149 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for empirical
- ActBot is a prototype for an injectable chatbot to give any website agentic capabilities ☆59 · Updated 5 months ago
- Run Open Source/Open Weight LLMs locally with OpenAI compatible APIs ☆65 · Updated this week
- LLM-ready data connectors ☆61 · Updated 5 months ago
- A curated list of open source repositories for AI Engineers ☆80 · Updated last month
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆215 · Updated this week
- Your automated SWE fleet to get your tickets from the Backlog to Prod! ☆95 · Updated 6 months ago
- RAG on codebases using treesitter and LanceDB ☆25 · Updated this week
- AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices. ☆220 · Updated this week
- Python SDK for running evaluations on LLM generated responses ☆215 · Updated this week
- Evaluating the quality and capabilities of extractions and reasoning using various instructor clients. ☆47 · Updated last month
- Prompt engineering, automated. ☆240 · Updated this week
- Solving data for LLMs - Create quality synthetic datasets! ☆136 · Updated 3 weeks ago
- AutoGPT for Web App Development ☆139 · Updated 11 months ago
- converts url content into JSON with a simple prefix ☆60 · Updated 6 months ago
- Sidecar is the AI brains for the Aide editor and works alongside it, locally on your machine ☆74 · Updated this week
- The open source AI app collection ☆166 · Updated 8 months ago
- Foyle is a copilot to help developers deploy and operate their applications. ☆105 · Updated this week
- A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php. ☆86 · Updated this week
- Open source AI Agent evaluation framework for web tasks 🐒🍌 ☆265 · Updated last week
- Repository for fine-tuning gemma models using unsloth for indic languages ☆82 · Updated 7 months ago
- Full stack tools for building voice agents ☆66 · Updated this week
- mahilo: Multi-Agent Human-in-the-Loop Framework is a flexible framework for creating multi-agent systems that can each interact with huma… ☆52 · Updated this week
- Announcing Instructor Cloud ☆33 · Updated 2 months ago
- 🤖 Headless IDE for AI agents ☆128 · Updated this week