NUS-TRAIL / SynthRL
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
☆66 Updated 2 months ago
Alternatives and similar repositories for SynthRL
Users who are interested in SynthRL are comparing it to the repositories listed below.
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆134 Updated 3 months ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆46 Updated 3 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆44 Updated this week
- [NeurIPS 2025🔥] Main source code of SRPO framework. ☆83 Updated this week
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆62 Updated 2 months ago
- Official Repository of LatentSeek ☆60 Updated 3 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆95 Updated 9 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆91 Updated last week
- ☆332 Updated last month
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆146 Updated 3 months ago
- ☆167 Updated 4 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆46 Updated 11 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆52 Updated last month
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆69 Updated 2 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆69 Updated 5 months ago
- ☆46 Updated 5 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 Updated 5 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆80 Updated 3 months ago
- ☆125 Updated 6 months ago
- ☆101 Updated this week
- Large Language Models Can Self-Improve in Long-context Reasoning ☆73 Updated 10 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆86 Updated 7 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆47 Updated 2 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆178 Updated 6 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models" ☆35 Updated 3 months ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆82 Updated 4 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆84 Updated 7 months ago
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆70 Updated last year
- Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents ☆184 Updated 4 months ago
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆59 Updated 9 months ago