stepfun-ai / PaCoRe
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
☆313Updated last week
Alternatives and similar repositories for PaCoRe
Users who are interested in PaCoRe are comparing it to the repositories listed below.
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond☆191Updated 7 months ago
- Towards a Unified View of Large Language Model Post-Training☆201Updated 5 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆255Updated 6 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆229Updated 5 months ago
- A construction kit for reinforcement learning environment management.☆326Updated this week
- MiroTrain is an efficient and algorithm-first framework research agent.☆132Updated 5 months ago
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr…☆322Updated 3 months ago
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling☆469Updated 8 months ago
- 🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei…☆196Updated 2 months ago
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.☆90Updated 3 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning☆331Updated 8 months ago
- Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization☆374Updated last month
- ☆111Updated 4 months ago
- ☆236Updated last week
- Scaling RL on advanced reasoning models☆662Updated 3 months ago
- [ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆353Updated 3 weeks ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆216Updated 2 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling☆182Updated 6 months ago
- ☆230Updated last month
- Official JAX implementation of End-to-End Test-Time Training for Long Context☆520Updated 2 weeks ago
- [NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis☆148Updated 3 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example☆408Updated 2 months ago
- The official repository of the paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"☆111Updated 5 months ago
- d3LLM: Ultra-Fast Diffusion LLM 🚀☆90Updated last week
- OpenTinker is an RL-as-a-Service infrastructure for foundation models☆625Updated 2 weeks ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.☆164Updated 4 months ago
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments☆177Updated last month
- [ICLR 2026] TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models☆423Updated 2 weeks ago
- [ICLR 2026] QeRL enables RL for 32B LLMs on a single H100 GPU.☆484Updated 2 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution☆104Updated 4 months ago