JudgmentLabs / judgeval
Rigorously test, monitor, and optimize agent systems—grounded in research from Stanford and Berkeley AI Labs.
☆85 · Updated this week
Alternatives and similar repositories for judgeval
Users who are interested in judgeval are comparing it to the libraries listed below.
- Cloudstate is a JavaScript database runtime. ☆179 · Updated 2 months ago
- Prompt engineering, automated. ☆329 · Updated 2 months ago
- AI benchmark runtime framework that allows you to integrate and evaluate AI tasks using Docker-based benchmarks. ☆152 · Updated last month
- vscode extension to convert computationally intensive pytorch kernels to triton ☆22 · Updated 8 months ago
- ☆49 · Updated this week
- ☆77 · Updated 3 months ago
- ACP is the Agent Control Plane - a distributed agent scheduler optimized for simplicity, clarity, and control. It is designed for outer-l… ☆83 · Updated last week
- Yet Another Web Extraction SDK ☆47 · Updated last week
- ☆62 · Updated 2 months ago
- A customizable, general purpose AI Agent that supports MCP. Talk to Saiki in natural language to control computers, applications and more… ☆155 · Updated this week
- Universal language-agnostic AST walking and accurate call stack generation with tree-sitter. ☆106 · Updated 10 months ago
- ☆134 · Updated 3 months ago
- 🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨ ☆93 · Updated 10 months ago
- Deploy Astro.js to freestyle.sh with cloudstate javascript object persistence. ☆49 · Updated 4 months ago
- OpenInt is the fastest way to add native product integrations to your app. ☆182 · Updated last week
- An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application. ☆120 · Updated 3 months ago
- The easiest way to use GPUs. ☆110 · Updated 2 weeks ago
- Find the best ML model for your use case | Y Combinator Fall 2024 ☆21 · Updated 8 months ago
- Build AI Agents with Your Existing Python Code! ☆58 · Updated 7 months ago
- Open Source Lovable ☆74 · Updated this week
- 🐍 Sublingual helps you log and analyze all of your LLM calls, including the prompt template, call parameters, responses, tool calls, and… ☆51 · Updated 3 months ago
- A new chunking strategy developed by ZeroEntropy for general semantic chunking using Llama-70B. ☆185 · Updated 4 months ago
- Agent computer interface for AI software engineer. ☆85 · Updated this week
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 · Updated last year
- A prompt management, versioning, testing, and evaluation inference server and UI toolkit. Provider agnostic and OpenAI API compatible. ☆93 · Updated 2 weeks ago
- Blockoli is a high-performance tool for code indexing, embedding generation, and semantic search for use with LLMs. ☆117 · Updated last year
- 🤖 Headless IDE for AI agents ☆191 · Updated 2 months ago
- Multi-language code navigation API in a container ☆80 · Updated last month
- ☆90 · Updated 3 months ago
- Open source payments + billing infrastructure ☆174 · Updated this week