EIT-NLP / Distilling-CoT-Reasoning
[ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".
☆18Updated 5 months ago
Alternatives and similar repositories for Distilling-CoT-Reasoning
Users who are interested in Distilling-CoT-Reasoning are comparing it to the libraries listed below.
- ☆48Updated 9 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆39Updated last month
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆27Updated 2 months ago
- Pitfalls of Rule- and Model-based Verifiers: A Case Study on Mathematical Reasoning.☆22Updated 2 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆63Updated 9 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Updated 7 months ago
- ☆140Updated 2 months ago
- ☆39Updated 3 months ago
- ☆40Updated 2 weeks ago
- ☆24Updated 3 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆27Updated 3 weeks ago
- ☆22Updated last year
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆20Updated last week
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping☆52Updated 2 months ago
- ☆23Updated 3 months ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of…☆38Updated 2 months ago
- ☆67Updated last month
- ☆21Updated 3 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy☆65Updated 7 months ago
- Model merging is a highly efficient approach for long-to-short reasoning.☆77Updated 2 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"☆38Updated 3 weeks ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆64Updated 3 weeks ago
- ☆14Updated 7 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆127Updated 4 months ago
- The demo, code and data of FollowRAG☆74Updated last month
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆105Updated 3 months ago
- Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…☆20Updated 7 months ago
- A Sober Look at Language Model Reasoning☆81Updated last month
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆61Updated 9 months ago
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"☆17Updated 5 months ago