MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) plays a significant role in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of the reasoning steps in prompts remains largely unknown. To shed light on this, we conduct several empirical experiments to explore this relationship.
☆45 · Updated 4 months ago
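The repository's central manipulation, per the description above, is varying how many reasoning steps a CoT prompt contains while holding the task fixed. As a rough sketch only (the question, step texts, and `build_cot_prompt` helper below are hypothetical illustrations, not the repository's actual code), the setup might look like:

```python
# Minimal sketch: construct CoT prompts whose demonstration contains a
# varying number of reasoning steps, so downstream accuracy can be compared
# across step counts. All strings and names here are hypothetical.

BASE_QUESTION = "Q: A shop sells pens at $2 each. How much do 4 pens cost?"

# A hypothetical worked demonstration, split into atomic reasoning steps.
STEPS = [
    "Each pen costs $2.",
    "We need the total cost of 4 pens.",
    "The total is 4 * $2.",
    "4 * 2 = 8, so the total is $8.",
]

def build_cot_prompt(question: str, steps: list[str], n_steps: int) -> str:
    """Compose a CoT prompt using only the first n_steps reasoning steps."""
    chain = " ".join(steps[:n_steps])
    return f"{question}\nA: Let's think step by step. {chain}\nThe answer is 8.\n"

if __name__ == "__main__":
    # Emit prompts with progressively longer reasoning chains; in the paper's
    # setting, each would be sent to an LLM and scored for accuracy.
    for n in range(1, len(STEPS) + 1):
        print(f"--- demonstration with {n} step(s) ---")
        print(build_cot_prompt(BASE_QUESTION, STEPS, n))
```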
Alternatives and similar repositories for The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
Users interested in The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models are comparing it to the repositories listed below
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆61 · Updated 11 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆128 · Updated 6 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆151 · Updated 11 months ago
- ☆50 · Updated 11 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 · Updated last year
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆29 · Updated 10 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆43 · Updated 2 weeks ago
- This is the implementation of LeCo ☆31 · Updated 8 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M… ☆29 · Updated last year
- ☆59 · Updated last year
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆120 · Updated last year
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆61 · Updated 9 months ago
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆109 · Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆82 · Updated 6 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆111 · Updated 4 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆186 · Updated 8 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆59 · Updated last year
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis' ☆30 · Updated 4 months ago
- ☆26 · Updated last year
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆134 · Updated 3 months ago
- ☆121 · Updated last month
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆104 · Updated 5 months ago
- Code for the 2025 ACL publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs" ☆33 · Updated 3 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆120 · Updated last year
- This is the implementation for the paper "Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Rea… ☆26 · Updated last year
- ☆26 · Updated 5 months ago
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 · Updated last year
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆62 · Updated 2 months ago
- ☆67 · Updated 3 months ago
- Code for the EMNLP 2024 paper "AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction" ☆13 · Updated 10 months ago