Xuekai-Zhu / FlowRL
☆132 Updated 2 months ago
Alternatives and similar repositories for FlowRL
Users who are interested in FlowRL are comparing it to the repositories listed below.
- ☆348 Updated 6 months ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆398 Updated last month
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 Updated 6 months ago
- repo for paper https://arxiv.org/abs/2504.13837 ☆323 Updated last month
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆174 Updated 4 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆124 Updated 10 months ago
- A repo for open research on building large reasoning models ☆130 Updated this week
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆350 Updated 2 weeks ago
- Towards a Unified View of Large Language Model Post-Training ☆199 Updated 4 months ago
- ☆204 Updated last month
- A Sober Look at Language Model Reasoning ☆92 Updated 2 months ago
- ☆64 Updated 3 months ago
- P1: Mastering Physics Olympiads with Reinforcement Learning ☆73 Updated last month
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆197 Updated 10 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆116 Updated 5 months ago
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆138 Updated last month
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆99 Updated last year
- ☆130 Updated this week
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆70 Updated 9 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆411 Updated 6 months ago
- ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time comput… ☆107 Updated last month
- ☆110 Updated 4 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆348 Updated 2 months ago
- Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆304 Updated this week
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆175 Updated 2 weeks ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies ☆56 Updated 3 weeks ago
- ☆379 Updated 2 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆259 Updated 8 months ago
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆67 Updated 4 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆98 Updated this week