PRIME-RL / P1
P1: Mastering Physics Olympiads with Reinforcement Learning
☆67 · Updated 3 weeks ago
Alternatives and similar repositories for P1
Users who are interested in P1 compare it to the repositories listed below.
- Reinforcing General Reasoning without Verifiers ☆92 · Updated 5 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆57 · Updated 9 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆48 · Updated 4 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆54 · Updated last month
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning