rdi-berkeley/awesome-RLVR-boundary

rdi-berkeley / awesome-RLVR-boundary

A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Language Models (LLMs).

☆89

Alternatives and similar repositories for awesome-RLVR-boundary

Users that are interested in awesome-RLVR-boundary are comparing it to the libraries listed below.

hemingkx / Whisper
[ACL 2026] Enabling Efficient Reasoning in LLMs via Black-box Persuasive Prompting
☆22 · Jan 9, 2026 · Updated 6 months ago
sunblaze-ucb / rl-grok-recipe
Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"
☆35 · Oct 12, 2025 · Updated 9 months ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆346 · Dec 17, 2025 · Updated 7 months ago
microsoft / experiential_rl
The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1
☆76 · Jul 2, 2026 · Updated 3 weeks ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated last year
mandyyyyii / east
☆19 · Aug 4, 2025 · Updated 11 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
sail-sg / OPER
code for the paper Offline Prioritized Experience Replay
☆12 · Jun 13, 2023 · Updated 3 years ago
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆229 · Nov 27, 2025 · Updated 8 months ago
YiCheng98 / IntegrativeDecoding
Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"
☆33 · Apr 12, 2025 · Updated last year
fjzzq2002 / WeightWatch
Official Repository of Paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 10 months ago
wangjs9 / Muffin
Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings)
☆17 · Jul 2, 2024 · Updated 2 years ago
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173 · Jan 16, 2025 · Updated last year
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
rookie-joe / FormalAlign
☆17 · Jul 12, 2025 · Updated last year
StarDewXXX / AdaR1
The official repository of NeurIPS'25 paper "Ada-R1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"
☆24 · May 6, 2026 · Updated 2 months ago
sail-sg / Precision-RL
Defeating the Training-Inference Mismatch via FP16
☆197 · Nov 14, 2025 · Updated 8 months ago
OPTML-Group / Unlearn-Smooth
[ICML25] Official repo for "Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond…
☆24 · Sep 27, 2025 · Updated 10 months ago
linhaowei1 / SLD
[ICLR26] AI-based scaling law discovery
☆31 · Jan 30, 2026 · Updated 5 months ago
jkatzsam / woods_ood
☆16 · May 25, 2022 · Updated 4 years ago
shangshang-wang / Tora
Tora: Torchtune-LoRA for RL
☆87 · Dec 2, 2025 · Updated 7 months ago
AheadOFpotato / Awesome-LRM-Mechanisms
Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures
☆34 · Jan 29, 2026 · Updated 6 months ago
Gen-Verse / CURE
[NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning
☆167 · Sep 19, 2025 · Updated 10 months ago
LARK-AI-Lab / CodeScaler
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆35 · Mar 26, 2026 · Updated 4 months ago
princeton-pli / RLMT
[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
☆129 · Oct 27, 2025 · Updated 9 months ago
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
yjywdzh / ACE
Code for the paper "ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall"
☆15 · Jan 31, 2026 · Updated 5 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆64 · Mar 30, 2024 · Updated 2 years ago
franciscoliu / Awesome-GenAI-Unlearning
☆188 · Apr 22, 2026 · Updated 3 months ago
AngelaZZZ-611 / reasoning_models_probing
☆22 · May 14, 2026 · Updated 2 months ago
kaishxu / DFMed
Code and data for "Medical Dialogue Generation via Dual Flow Modeling" (ACL 2023 Findings)
☆14 · Nov 22, 2023 · Updated 2 years ago
hemingkx / Awesome-Efficient-Reasoning
Paper list for Efficient Reasoning.
☆899 · May 29, 2026 · Updated 2 months ago
iwangjian / pyloader
🐳 PyLoader: An asynchronous Python dataloader for loading big datasets, supporting PyTorch and TensorFlow 2.x.
☆11 · Aug 29, 2021 · Updated 4 years ago
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆33 · Jan 17, 2025 · Updated last year
Model-GLUE / Model-GLUE
☆18 · Aug 19, 2024 · Updated last year
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆36 · Feb 26, 2026 · Updated 5 months ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆37 · Dec 9, 2024 · Updated last year