General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆222Nov 27, 2025Updated 3 months ago
Alternatives and similar repositories for General-Reasoner
Users who are interested in General-Reasoner are comparing it to the repositories listed below
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆48Oct 16, 2025Updated 5 months ago
- A series of technical reports on Slow Thinking with LLM☆761Aug 13, 2025Updated 7 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆50Feb 4, 2026Updated last month
- ☆334May 31, 2025Updated 9 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆28Mar 1, 2025Updated last year
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks".☆222Jun 24, 2025Updated 8 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆145Nov 13, 2025Updated 4 months ago
- Official Repo for Open-Reasoner-Zero☆2,086Jun 2, 2025Updated 9 months ago
- [TMLR] Process Reward Models That Think☆82Nov 29, 2025Updated 3 months ago
- Scaling RL on advanced reasoning models☆676Oct 20, 2025Updated 5 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆99Apr 9, 2025Updated 11 months ago
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated 10 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning☆287Sep 25, 2025Updated 5 months ago
- Async pipelined version of Verl☆124Apr 8, 2025Updated 11 months ago
- Technical report of Kimina-Prover Preview.☆364Jul 10, 2025Updated 8 months ago
- A version of verl to support diverse tool use☆911Mar 2, 2026Updated 3 weeks ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆73Feb 25, 2025Updated last year
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones☆64Jan 26, 2026Updated last month
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 7 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]☆184Jun 5, 2025Updated 9 months ago
- An Ultra-Long Output Reinforcement Learning Approach☆23Jul 31, 2025Updated 7 months ago
- [NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning☆52Oct 23, 2025Updated 5 months ago
- ☆16Sep 4, 2025Updated 6 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective☆1,232Aug 27, 2025Updated 6 months ago
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"☆22Nov 9, 2025Updated 4 months ago
- Instruction-following benchmark for large reasoning models☆44Aug 9, 2025Updated 7 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25]☆38Feb 1, 2026Updated last month
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆94Nov 8, 2025Updated 4 months ago
- Official repo of dataset-decomposition paper [NeurIPS 2024]☆21Jan 8, 2025Updated last year
- ☆17Aug 1, 2025Updated 7 months ago
- Unleashing the Power of Reinforcement Learning for Math and Code Reasoners☆743Jun 6, 2025Updated 9 months ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL☆4,261Nov 13, 2025Updated 4 months ago
- [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆194Sep 13, 2025Updated 6 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning☆159Sep 19, 2025Updated 6 months ago
- ☆46Jun 24, 2025Updated 8 months ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆27Oct 14, 2025Updated 5 months ago
- Scalable RL solution for advanced reasoning of language models☆1,821Mar 18, 2025Updated last year
- [ICLR2026] Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping☆63May 22, 2025Updated 10 months ago