commit-0 / commit0
Commit0: Library Generation from Scratch
☆144 Updated 3 weeks ago
Alternatives and similar repositories for commit0:
Users who are interested in commit0 compare it to the repositories listed below.
- r2e: turn any github repository into a programming agent environment ☆116 Updated 2 weeks ago
- ☆79 Updated 2 weeks ago
- RepoQA: Evaluating Long-Context Code Understanding ☆108 Updated 6 months ago
- prime-rl is a codebase for decentralized RL training at scale ☆85 Updated this week
- Can Language Models Solve Olympiad Programming? ☆116 Updated 3 months ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon! ☆43 Updated 3 weeks ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆172 Updated last month
- SWE Arena ☆33 Updated 3 weeks ago
- A simple unified framework for evaluating LLMs ☆209 Updated 3 weeks ago
- ☆114 Updated 2 months ago
- Train your own SOTA deductive reasoning model ☆91 Updated last month
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆94 Updated this week
- LILO: Library Induction with Language Observations ☆86 Updated 8 months ago
- ☆123 Updated last month
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 9 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 Updated 5 months ago
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions ☆42 Updated 9 months ago
- ☆85 Updated last week
- ☆130 Updated last month
- Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆64 Updated 2 weeks ago
- ☆37 Updated 3 months ago
- Long context evaluation for large language models ☆206 Updated 2 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym ☆448 Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆150 Updated this week
- EvaByte: Efficient Byte-level Language Models at Scale ☆91 Updated 2 weeks ago
- ☆40 Updated 9 months ago
- Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd… ☆102 Updated 3 weeks ago
- ☆60 Updated last year
- Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more. ☆170 Updated this week