ZJU-REAL / Mind-the-Gap
Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684
☆37Updated last month
Alternatives and similar repositories for Mind-the-Gap
Users who are interested in Mind-the-Gap are comparing it to the libraries listed below
- Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆41Updated 2 weeks ago
- ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models☆48Updated 3 weeks ago
- This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).☆45Updated 3 weeks ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆18Updated last week
- Code for Paper InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆28Updated 3 weeks ago
- A Self-Training Framework for Vision-Language Reasoning☆80Updated 5 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆33Updated last month
- ☆62Updated last week
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process!☆54Updated 2 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆125Updated last week
- ☆46Updated 2 months ago
- A comprehensive collection of process reward models.☆92Updated 2 weeks ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆54Updated 6 months ago
- ☆19Updated last month
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.☆73Updated 4 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆75Updated 3 weeks ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆113Updated 2 months ago
- ☆74Updated last year
- [ACL 2025] A Neural-Symbolic Self-Training Framework☆109Updated 3 weeks ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆30Updated last week
- Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?☆28Updated 3 weeks ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆62Updated 3 weeks ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆191Updated last week
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.☆22Updated 4 months ago
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning!☆45Updated 3 months ago
- The demo, code and data of FollowRAG☆73Updated 2 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆136Updated last week
- ☆43Updated 3 months ago
- The official implementation of SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning☆15Updated last month
- Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents☆116Updated last month