ZJU-REAL / TimeHC-RL
This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).
☆47 Updated 5 months ago
Alternatives and similar repositories for TimeHC-RL
Users who are interested in TimeHC-RL are comparing it to the repositories listed below.
- [NeurIPS 2025] Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆50 Updated last month
- GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts ☆35 Updated last month
- ☆36 Updated last month
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684 ☆44 Updated 2 weeks ago
- Code for Paper InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models ☆38 Updated 3 months ago
- A Unified Framework for High-Performance and Extensible LLM Steering ☆100 Updated this week
- ☆24 Updated 2 months ago
- ☆29 Updated 2 months ago
- This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection 🎉 ☆24 Updated 5 months ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆85 Updated 3 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆30 Updated 3 months ago
- Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning" ☆94 Updated 2 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆46 Updated 4 months ago
- ☆44 Updated 5 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆47 Updated last month
- ☆98 Updated 10 months ago
- Official Repository of LatentSeek ☆66 Updated 5 months ago
- Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding" ☆75 Updated 8 months ago
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability" ☆33 Updated last year
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 9 months ago
- ☆18 Updated 2 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆95 Updated last month
- ☆84 Updated last year
- ✨✨ The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆50 Updated 3 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆19 Updated 8 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆40 Updated 2 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 5 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆36 Updated 4 months ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆83 Updated 9 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆26 Updated 5 months ago