EIT-NLP / Distilling-CoT-Reasoning
[ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". (By Xinghao Chen)
☆15 Updated 3 months ago
Alternatives and similar repositories for Distilling-CoT-Reasoning
Users who are interested in Distilling-CoT-Reasoning are comparing it to the repositories listed below.
- ☆46 Updated 7 months ago
- ☆28 Updated last month
- ☆13 Updated 10 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 3 months ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ☆41 Updated 2 weeks ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆60 Updated 5 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆98 Updated last month
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆30 Updated 2 weeks ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆21 Updated last week
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆35 Updated 3 weeks ago
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency" ☆26 Updated last month
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models ☆21 Updated 10 months ago
- MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion (ACL 2025) ☆22 Updated last week
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024) ☆32 Updated 11 months ago
- [EMNLP 2024 Main] Official implementation of the paper "The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Langua… ☆14 Updated 6 months ago
- [ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling ☆14 Updated 5 months ago
- ☆19 Updated last month
- ☆19 Updated 3 weeks ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 Updated 7 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Updated 5 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆54 Updated 6 months ago
- ☆22 Updated 11 months ago
- ☆35 Updated 3 months ago
- The official code repository for PRMBench. ☆73 Updated 3 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆61 Updated 5 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs ☆27 Updated 2 weeks ago
- ☆45 Updated last month
- ☆24 Updated last month
- DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails ☆23 Updated 3 months ago