JudgmentLabs / judgeval
The open-source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.
☆1,016 Updated last week
Alternatives and similar repositories for judgeval
Users who are interested in judgeval are comparing it to the libraries listed below.
- OSS RL environment + evals toolkit ☆267 Updated this week
- The everything tool for model alignment ☆62 Updated 4 months ago
- A month-long, open-source AI Agent Hackathon, open to all builders and dreamers working on agents, RAG, tool use, and multi-agent system… ☆243 Updated 6 months ago
- A Python library for LLM-based evaluation using weighted rubrics. ☆43 Updated this week
- A catalogue of existing Nanda servers ☆190 Updated 8 months ago
- An interface library for RL post-training with environments. ☆973 Updated this week
- 🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data. ☆620 Updated this week
- An alignment auditing agent capable of quickly exploring alignment hypotheses ☆791 Updated this week
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in PyTorch ☆399 Updated 2 months ago
- Open source codebase for Scale Agentex ☆253 Updated this week
- NdLinear by Ensemble is a drop-in PyTorch module that shrinks your models with no accuracy loss. It powers the Ensemble Platform: upload a… ☆298 Updated 7 months ago
- ☆101 Updated 7 months ago
- A multi-agent orchestration framework that works with any agent framework ☆232 Updated 7 months ago
- Find the Root Cause in Your Code's Trace ☆377 Updated this week
- From idea to production in just a few lines: Graph-Based Programmable Neuro-Symbolic LM Framework - a production-first LM framework built w… ☆395 Updated 2 weeks ago
- The AI Browser Automation Framework ☆380 Updated this week
- Agent File (.af): An open file format for serializing stateful AI agents with persistent memory and behavior. Share, checkpoint, and vers… ☆984 Updated last month
- Repository of implementations of classic and SOTA RL algorithms from scratch in PyTorch ☆216 Updated last week
- On the Theoretical Limitations of Embedding-Based Retrieval ☆618 Updated 3 months ago
- ☆1,269 Updated this week
- The OS for Claude Code ☆167 Updated 2 months ago
- ☆235 Updated last week
- Tool for generating high-quality synthetic datasets ☆1,455 Updated 2 months ago
- Orchestrate zero-shot computer vision models ☆392 Updated last year
- Curated collection of community environments ☆200 Updated this week
- The official Python library for the Arklex framework ☆689 Updated last week
- Dynamiq is an orchestration framework for agentic AI and LLM applications ☆1,018 Updated this week
- Data infrastructure providing a declarative, incremental approach for multimodal AI workloads. ☆1,527 Updated this week
- This project transforms a traditional portfolio into an interactive experience by allowing visitors to have a conversation with an AI-pow… ☆26 Updated 5 months ago
- An open-source tool for LLM prompt optimization. ☆738 Updated last week