StarDewXXX / O1-Pruner
Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
☆64 Updated last month
Alternatives and similar repositories for O1-Pruner:
Users who are interested in O1-Pruner are comparing it to the repositories listed below:
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 Updated 2 weeks ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆103 Updated 2 weeks ago
- ☆49 Updated last month
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆107 Updated last week
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆46 Updated 4 months ago
- ☆82 Updated 2 weeks ago
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond ☆42 Updated this week
- ☆129 Updated last week
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆43 Updated 3 weeks ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 Updated last month
- Repo of paper "Free Process Rewards without Process Labels" ☆138 Updated 2 weeks ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆67 Updated last week
- ☆171 Updated last month
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- ☆85 Updated 3 weeks ago
- ☆107 Updated last month
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆78 Updated 3 weeks ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆84 Updated last month
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆100 Updated 3 months ago
- ☆54 Updated 5 months ago
- The code of RouterDC ☆56 Updated last month
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆171 Updated last week
- A Survey on Efficient Reasoning for LLMs ☆204 Updated this week
- Official implementation of paper "Process Reward Model with Q-value Rankings" ☆51 Updated last month
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,… ☆48 Updated 3 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆107 Updated last year
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆64 Updated last week
- ☆138 Updated 2 weeks ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆41 Updated 3 months ago
- ☆61 Updated 3 months ago