gso-bench / gso
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
☆19 · Updated last week
Alternatives and similar repositories for gso
Users who are interested in gso are comparing it to the repositories listed below:
- r2e: turn any GitHub repository into a programming agent environment ☆124 · Updated last month
- Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆73 · Updated last month
- Can Language Models Solve Olympiad Programming? ☆116 · Updated 4 months ago
- A benchmark for LLMs on complicated tasks in the terminal ☆141 · Updated this week
- ☆180 · Updated last month
- RepoQA: Evaluating Long-Context Code Understanding ☆108 · Updated 7 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆119 · Updated this week
- ☆24 · Updated 7 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation