phonism / CP-Zero
Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.
☆18 · Updated 9 months ago
Alternatives and similar repositories for CP-Zero
Users who are interested in CP-Zero are comparing it to the repositories listed below.
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆120 · Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆286 · Updated 9 months ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings ☆65 · Updated last year
- The official repository of the Omni-MATH benchmark. ☆93 · Updated last year
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆182 · Updated 6 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆143 · Updated 2 months ago
- ☆50 · Updated 5 months ago
- ☆87 · Updated 5 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆96 · Updated 10 months ago
- ☆215 · Updated 11 months ago
- ☆80 · Updated 10 months ago
- ☆78 · Updated last year
- Resources for the Enigmata Project. ☆77 · Updated 5 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆120 · Updated 9 months ago
- Async pipelined version of Verl ☆124 · Updated 10 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆246 · Updated 4 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. ☆98 · Updated 3 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆116 · Updated 6 months ago
- This is the official implementation for paper "PENCIL: Long Thoughts with Short Memory". ☆73 · Updated 9 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆149 · Updated 4 months ago
- ☆74 · Updated 9 months ago
- A Sober Look at Language Model Reasoning ☆92 · Updated 2 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆55 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆49 · Updated last year
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆87 · Updated 10 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆77 · Updated 4 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆236 · Updated 6 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆147 · Updated last year
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆273 · Updated last year
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆150 · Updated last year