NUS-TRAIL / SynthRL

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

☆68

Alternatives and similar repositories for SynthRL

Users who are interested in SynthRL are comparing it to the repositories listed below.

RM-R1-UIUC / RM-R1
RM-R1: Unleashing the Reasoning Potential of Reward Models
☆156 Updated 7 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆70 Updated 6 months ago
kokolerk / TON
[NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
☆53 Updated 4 months ago
SihengLi99 / SEALONG
Large Language Models Can Self-Improve in Long-context Reasoning
☆72 Updated last year
xuyige / SoftCoT
ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…
☆76 Updated 8 months ago
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆131 Updated 9 months ago
TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆64 Updated 3 months ago
ReasoningTransfer / Transferability-of-LLM-Reasoning
☆108 Updated last month
Dereck0602 / Awesome_Test_Time_LLMs
☆141 Updated 10 months ago
dongxiangjue / Awesome-LLM-Self-Improvement
A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …
☆99 Updated last year
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆179 Updated 7 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆73 Updated 6 months ago
ruixin31 / Spurious_Rewards
☆352 Updated 6 months ago
Joshua-Ren / Learning_dynamics_LLM
☆204 Updated last month
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆64 Updated last year
bigai-nlco / LatentSeek
Official Repository of LatentSeek
☆76 Updated 7 months ago
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆51 Updated 6 months ago
HZQ950419 / Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
☆92 Updated last year
yafuly / TPO
Test-time preference optimization (ICML 2025).
☆178 Updated 8 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆182 Updated 6 months ago
Gen-Verse / GenEnv
☆45 Updated last month
shawnricecake / Heima
Code for Heima
☆59 Updated 9 months ago
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆89 Updated 11 months ago
THU-KEG / AdaptThink
☆177 Updated last month
ltzheng / SimpleTIR
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆351 Updated 2 weeks ago
RUCBM / DeepCritic
Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"
☆41 Updated 7 months ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆73 Updated 9 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆51 Updated last year
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆157 Updated 3 months ago
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, code, and other resources related to Latent Reasoning.
☆349 Updated 2 months ago