MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) prompting significantly improves the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of the reasoning steps in prompts remains largely unknown. To shed light on this, we conduct several empirical experiments to explore the relationship.
☆46 · Updated 2 months ago
Alternatives and similar repositories for The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
Users interested in The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models are comparing it to the repositories listed below.
- [NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs" ☆125 · Updated 4 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆61 · Updated 9 months ago
- ☆59 · Updated 10 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 · Updated last year
- ☆47 · Updated 8 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing" ☆80 · Updated 6 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆146 · Updated 8 months ago
- Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆81 · Updated 2 weeks ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆61 · Updated 7 months ago
- Code for the ACL 2025 publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs" ☆29 · Updated last month
- [ICLR 2025 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆56 · Updated last week
- This is the implementation of LeCo ☆31 · Updated 6 months ago
- ☆116 · Updated 4 months ago
- The repository for the project "Fine-tuning Large Language Models with Sequential Instructions"; the code base comes from open-instruct and LA… ☆29 · Updated 8 months ago
- Code for the ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆57 · Updated last year
- ☆41 · Updated 8 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆114 · Updated last year
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆60 · Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 · Updated 7 months ago
- [ACL'24] Code and data for the paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated last year
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆74 · Updated 4 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆179 · Updated 6 months ago
- The repository for the ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models" ☆31 · Updated last year
- The rule-based evaluation subset and code implementation of Omni-MATH ☆22 · Updated 7 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025 ☆22 · Updated 5 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆105 · Updated 2 months ago
- Official code implementation for the ACL 2025 paper "CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis" ☆27 · Updated 2 months ago
- Unofficial implementation of "Chain-of-Thought Reasoning Without Prompting" ☆32 · Updated last year
- ☆126 · Updated 2 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆135 · Updated this week