microsoft / debug-gym
A Text-Based Environment for Interactive Debugging
☆275 · Updated this week
Alternatives and similar repositories for debug-gym
Users interested in debug-gym are comparing it to the libraries listed below.
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆204 · Updated last week
- Agent computer interface for AI software engineer. ☆110 · Updated last month
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆349 · Updated this week
- ☆79 · Updated last month
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 · Updated last week
- DSPy module for OpenAI Codex SDK - signature-driven agentic workflows ☆124 · Updated this week
- A framework for optimizing DSPy programs with RL ☆208 · Updated last week
- An alignment auditing agent capable of quickly exploring alignment hypotheses ☆609 · Updated last week
- ☆152 · Updated 3 months ago
- Context Engineering Course with DSPy ☆198 · Updated 3 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents ☆432 · Updated last week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆268 · Updated 2 weeks ago
- ☆232 · Updated 3 months ago
- ☆80 · Updated last month
- An Automatic Prompt Optimization Framework for Large Language Models ☆130 · Updated 2 months ago
- Test Generation for Prompts ☆142 · Updated last week
- Coding problems used in aider's polyglot benchmark ☆184 · Updated 10 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆479 · Updated last week
- 🤖 Headless IDE for AI agents ☆200 · Updated 3 weeks ago
- Verifiers for LLM Reinforcement Learning ☆77 · Updated last month
- Prompts used in the Automated Auditing Blog Post ☆114 · Updated 3 months ago
- The Prime Intellect CLI provides a powerful command-line interface for managing GPU resources across various providers ☆100 · Updated this week
- MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents ☆461 · Updated 2 weeks ago
- Benchmark and optimize LLM inference across frameworks with ease ☆124 · Updated last month
- ☆47 · Updated 2 months ago
- ☆59 · Updated 9 months ago
- Letting Claude Code develop its own MCP tools :) ☆123 · Updated 7 months ago
- Frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆192 · Updated 2 months ago
- ☆113 · Updated last week
- An MCP server that uses the Osmosis-Apply-1.7B model to apply code merges ☆53 · Updated 3 months ago