MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of reasoning steps in prompts remains largely unknown. To shed light on this, we have conducted several empirical experiments to explore this relationship.
☆46 · Updated 6 months ago
Alternatives and similar repositories for The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
Users interested in The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models are comparing it to the repositories listed below.
- ☆51 · Updated last year
- [NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs" ☆132 · Updated 8 months ago
- This is the implementation of LeCo ☆31 · Updated 10 months ago
- ☆58 · Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆63 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆118 · Updated 7 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆63 · Updated last year
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆34 · Updated last year
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆135 · Updated 2 months ago
- [NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models ☆105 · Updated last year
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 · Updated last year
- Official code for the paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆57 · Updated 2 months ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions"; code base comes from open-instruct and LA… ☆30 · Updated last year
- ☆69 · Updated 5 months ago
- ☆134 · Updated 8 months ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆129 · Updated last month
- [NAACL 2025] The official implementation of the paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…" ☆29 · Updated last year
- ☆23 · Updated last year
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL'25] ☆94 · Updated 7 months ago
- [AAAI 2025 Oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆76 · Updated last month
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆151 · Updated 5 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆83 · Updated 8 months ago
- RL Scaling and Test-Time Scaling (ICML'25) ☆112 · Updated 10 months ago
- ☆25 · Updated 7 months ago
- ☆38 · Updated 3 months ago
- Official code implementation for the ACL 2025 paper "CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis" ☆31 · Updated 6 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆63 · Updated last year
- [ICLR 2025 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆70 · Updated 4 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆58 · Updated 5 months ago
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025) ☆20 · Updated last month