thu-nics / TaH
Official implementation of the paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models"
☆34Updated this week
Alternatives and similar repositories for TaH
Users who are interested in TaH are comparing it to the repositories listed below.
- ☆51Updated 9 months ago
- A Sober Look at Language Model Reasoning☆89Updated last week
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling☆180Updated 4 months ago
- ☆17Updated 3 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆97Updated 11 months ago
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Updated 2 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆83Updated 8 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆49Updated last month
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated 2 months ago
- ☆103Updated 2 months ago
- ☆104Updated last month
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆25Updated 3 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆53Updated last month
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆27Updated 2 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆34Updated 4 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆122Updated 8 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆123Updated 7 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆89Updated last year
- ☆30Updated last week
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution☆97Updated 2 months ago
- Process Reward Models That Think☆63Updated last month
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]☆94Updated 7 months ago
- ☆32Updated 6 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆133Updated 2 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆65Updated 6 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆79Updated 3 weeks ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…☆128Updated last month
- ☆131Updated 8 months ago
- ☆49Updated last month
- ☆45Updated 2 months ago