thuml / RLVR-World
Official repository for "RLVR-World: Training World Models with Reinforcement Learning", https://arxiv.org/abs/2505.13934
☆30 Updated last week
Alternatives and similar repositories for RLVR-World
Users who are interested in RLVR-World are comparing it to the repositories listed below.
- ☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models ☆18 Updated last month
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆68 Updated this week
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆94 Updated last month
- Official Repository of LatentSeek ☆30 Updated last week
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆29 Updated 2 weeks ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆50 Updated 6 months ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆44 Updated 2 weeks ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆36 Updated last week
- ☆13 Updated 2 months ago
- ☆40 Updated 3 weeks ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆104 Updated 2 months ago
- ☆102 Updated last month
- ☆59 Updated 2 months ago
- Unsupervised GRPO ☆24 Updated this week
- Official implementation of paper "ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting" (CVPR 2025) ☆41 Updated last month
- Natural Language Reinforcement Learning ☆89 Updated 5 months ago
- ☆83 Updated last month
- ☆24 Updated 11 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 6 months ago
- The official implementation for Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free ☆38 Updated 3 weeks ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆56 Updated 2 months ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆36 Updated last week
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆60 Updated 5 months ago
- ☆231 Updated last week
- Webpage for RLHFlow ☆9 Updated 4 months ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆16 Updated last month
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆14 Updated 5 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆55 Updated 4 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" ☆103 Updated last week
- Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024… ☆36 Updated 6 months ago