GavinZhengOI / LiveCodeBench-Pro
☆100 · Updated last month
Alternatives and similar repositories for LiveCodeBench-Pro
Users who are interested in LiveCodeBench-Pro are comparing it to the libraries listed below.
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon! ☆45 · Updated 2 months ago
- The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆147 · Updated 3 weeks ago
- ☆77 · Updated 2 months ago
- 🚀 SWE-bench Goes Live! ☆80 · Updated last week
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆79 · Updated last month
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆87 · Updated 2 months ago
- ☆50 · Updated last week
- Open-Source LLM Coders with Co-Evolving Reinforcement Learning ☆87 · Updated 3 weeks ago
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆79 · Updated last week
- ☆58 · Updated last week
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆142 · Updated 2 weeks ago
- Revisiting Mid-training in the Era of RL Scaling ☆62 · Updated 2 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆61 · Updated 2 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆46 · Updated last month
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆47 · Updated 2 weeks ago
- Efficient Agent Training for Computer Use ☆106 · Updated 3 weeks ago
- Repo for paper https://arxiv.org/abs/2504.13837 ☆164 · Updated this week
- Code for "Reasoning to Learn from Latent Thoughts" ☆105 · Updated 3 months ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆102 · Updated 3 months ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ☆17 · Updated 2 months ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆64 · Updated last month
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 · Updated this week
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning ☆52 · Updated 3 weeks ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆44 · Updated 4 months ago
- ☆35 · Updated 2 weeks ago
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆55 · Updated last week
- Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling. ☆81 · Updated last week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆101 · Updated last month
- Reproducing R1 for Code with Reliable Rewards ☆222 · Updated last month
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆108 · Updated 2 months ago