JudgmentLabs / judgeval

Rigorously test, monitor, and optimize agent systems—grounded in research from Stanford and Berkeley AI Labs.

☆85

Alternatives and similar repositories for judgeval

Users who are interested in judgeval are comparing it to the libraries listed below.

freestyle-sh / cloudstate
Cloudstate is a JavaScript database runtime.
☆179 Updated 2 months ago
zenbase-ai / core
Prompt engineering, automated.
☆329 Updated 2 months ago
benchflow-ai / benchflow
AI benchmark runtime framework that allows you to integrate and evaluate AI tasks using Docker-based benchmarks.
☆152 Updated last month
proxis-dev / vscode-triton
VS Code extension to convert computationally intensive PyTorch kernels to Triton
☆22 Updated 8 months ago
lmnr-ai / lmnr-python
☆49 Updated this week
fixadev / fixa-observe
☆77 Updated 3 months ago
humanlayer / agentcontrolplane
ACP is the Agent Control Plane - a distributed agent scheduler optimized for simplicity, clarity, and control. It is designed for outer-l…
☆83 Updated last week
reworkd / harambe
Yet Another Web Extraction SDK
☆47 Updated last week
opentoolsteam / cli
☆62 Updated 2 months ago
truffle-ai / saiki
A customizable, general purpose AI Agent that supports MCP. Talk to Saiki in natural language to control computers, applications and more…
☆155 Updated this week
getAsterisk / stackwalk
Universal language-agnostic AST walking and accurate call stack generation with tree-sitter.
☆106 Updated 10 months ago
pig-dot-dev / piglet
☆134 Updated 3 months ago
simplifine-llm / Simplifine
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
☆93 Updated 10 months ago
freestyle-sh / freestyle-astro-template
Deploy Astro.js to freestyle.sh with Cloudstate JavaScript object persistence.
☆49 Updated 4 months ago
openintegrations / openint
OpenInt is the fastest way to add native product integrations to your app.
☆182 Updated last week
willccbb / mcp-client-server
An MCP Server that's also an MCP Client. Useful for letting Claude develop and test MCPs without needing to reset the application.
☆120 Updated 3 months ago
tensorpool / tensorpool
The easiest way to use GPUs.
☆110 Updated 2 weeks ago
CarterMcClellan / supercontrast-legacy
Find the best ML model for your use case | Y Combinator Fall 2024
☆21 Updated 8 months ago
Extensible-AI / DAGent
Build AI Agents with Your Existing Python Code!
☆58 Updated 7 months ago
freestyle-sh / Adorable
Open Source Lovable
☆74 Updated this week
sublingual-ai / sublingual-monitoring
🐍 Sublingual helps you log and analyze all of your LLM calls, including the prompt template, call parameters, responses, tool calls, and…
☆51 Updated 3 months ago
zeroentropy-ai / zchunk
A new chunking strategy developed by ZeroEntropy for general semantic chunking using Llama-70B.
☆185 Updated 4 months ago
All-Hands-AI / openhands-aci
Agent computer interface for AI software engineer.
☆85 Updated this week
SohamGovande / podplex
🦾💻🌐 distributed training & serverless inference at scale on RunPod
☆17 Updated last year
llmkit-ai / llmkit
A prompt management, versioning, testing, and evaluation inference server and UI toolkit. Provider agnostic and OpenAI API compatible.
☆93 Updated 2 weeks ago
getAsterisk / blockoli
Blockoli is a high-performance tool for code indexing, embedding generation, and semantic search for use with LLMs.
☆117 Updated last year
hide-org / hide
🤖 Headless IDE for AI agents
☆191 Updated 2 months ago
agentic-labs / lsproxy
Multi-language code navigation API in a container
☆80 Updated last month
browserbase / slack-operator
☆90 Updated 3 months ago
flowglad / flowglad
Open source payments + billing infrastructure
☆174 Updated this week