chroma-core / context-rot
This repository contains the toolkit for replicating results from our technical report.
☆163 Updated 2 months ago
Alternatives and similar repositories for context-rot
Users who are interested in context-rot are comparing it to the repositories listed below.
- ☆235 Updated 4 months ago
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ☆268 Updated this week
- Ranking LLMs on agentic tasks ☆199 Updated this week
- An alignment auditing agent capable of quickly exploring alignment hypothesis ☆652 Updated last week
- ☆79 Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆477 Updated 2 months ago
- ☆186 Updated last week
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆449 Updated 2 months ago
- MCP-based Agent Deep Evaluation System ☆138 Updated last month
- MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents ☆486 Updated last week
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆205 Updated 2 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ☆129 Updated 7 months ago
- Verifiers for LLM Reinforcement Learning ☆78 Updated 2 months ago
- An agent benchmark with tasks in a simulated software company. ☆581 Updated last month
- ☆216 Updated 3 weeks ago
- Routing on Random Forest (RoRF) ☆219 Updated last year
- A Text-Based Environment for Interactive Debugging ☆276 Updated this week
- An Automatic Prompt Optimization Framework for Large Language Models ☆137 Updated 3 months ago
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆135 Updated this week
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆138 Updated 9 months ago
- OSS RL environment + evals toolkit ☆200 Updated last week
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆120 Updated last month
- Tutorial for building LLM router ☆235 Updated last year
- Provider-agnostic, open-source evaluation infrastructure for language models ☆653 Updated this week
- Official Repo for CRMArena and CRMArena-Pro ☆125 Updated last week
- Readymade evaluators for agent trajectories ☆385 Updated 2 months ago
- Context Engineering Course with DSPy ☆202 Updated 3 months ago
- A list of AI memory projects ☆247 Updated 10 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆105 Updated 7 months ago
- Coding an LLM and its building blocks from scratch. ☆100 Updated 7 months ago