[NeurIPS 2025] Implementation of the paper "On Reasoning Strength Planning in Large Reasoning Models"
☆30 Jul 6, 2025 Updated 7 months ago
Alternatives and similar repositories for LRM-plans-CoT
Users who are interested in LRM-plans-CoT are comparing it to the libraries listed below
- ☆27 Jul 18, 2025 Updated 7 months ago
- SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis ☆68 Jul 24, 2025 Updated 7 months ago
- ☆32 Oct 13, 2025 Updated 4 months ago
- ☆21 Feb 22, 2026 Updated last week
- The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca… ☆45 Nov 6, 2025 Updated 3 months ago
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models ☆45 Sep 19, 2025 Updated 5 months ago
- RL with Experience Replay ☆55 Jul 27, 2025 Updated 7 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97 Feb 21, 2025 Updated last year
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 Aug 28, 2025 Updated 6 months ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 Mar 2, 2025 Updated last year
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie… ☆10 Feb 7, 2026 Updated 3 weeks ago
- Agent Memory Playground: AI Agent Memory Design & Optimization Techniques ☆30 Aug 7, 2025 Updated 6 months ago
- ☆38 Feb 20, 2026 Updated last week
- EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation ☆27 Jul 30, 2025 Updated 7 months ago
- ☆18 Mar 2, 2025 Updated last year
- Official implementation for Text Generation Beyond Discrete Token Sampling ☆21 Aug 11, 2025 Updated 6 months ago
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆88 Jun 10, 2025 Updated 8 months ago
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 Mar 25, 2025 Updated 11 months ago
- ☆13 Mar 9, 2024 Updated last year
- Korean Benchmark for Korean Legal Language Understanding ☆16 Nov 16, 2024 Updated last year
- An Ultra-Long Output Reinforcement Learning Approach ☆23 Jul 31, 2025 Updated 7 months ago
- The official github repo for MixEval-X, the first any-to-any, real-world benchmark. ☆16 Feb 15, 2025 Updated last year
- This repository contains the code for the paper: Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models ☆20 Apr 27, 2024 Updated last year
- The official implementation of dLLM-Var ☆30 Nov 6, 2025 Updated 3 months ago
- ☆12 Sep 1, 2023 Updated 2 years ago
- Code and Data for ACL 2025 Paper "Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework". ☆23 Oct 3, 2025 Updated 5 months ago
- Reinforcing General Reasoning without Verifiers ☆95 Jun 24, 2025 Updated 8 months ago
- ☆179 Dec 5, 2025 Updated 2 months ago
- Code for “SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation (ICLR 2025)” ☆24 Oct 23, 2025 Updated 4 months ago
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning ☆23 Oct 14, 2025 Updated 4 months ago
- 💻 Terminal-Agent with Human-in-the-Loop Learning ☆34 Jan 16, 2026 Updated last month
- ☆11 Oct 5, 2024 Updated last year
- Artifact evaluation for HPCA'24 paper Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accele… ☆11 Mar 3, 2024 Updated 2 years ago
- Optimizing Review Generation Through Prompt Generation ☆15 Apr 15, 2024 Updated last year
- Inducing Point Operator Transformer: A Flexible and Scalable Architecture for Solving PDEs (AAAI 2024) ☆15 Jul 30, 2024 Updated last year
- Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆18 Apr 16, 2025 Updated 10 months ago
- ☆15 Nov 7, 2024 Updated last year
- Customized Inference Engine for Multiverse Models ☆24 Jun 27, 2025 Updated 8 months ago
- ☆13 Jan 15, 2025 Updated last year