falonss703 / Awesome-Uncertainty-based-Reinforcement-Learning
🔥🔥🔥 Latest Papers, Codes on Uncertainty-based RL
☆56 Updated 4 months ago
Alternatives and similar repositories for Awesome-Uncertainty-based-Reinforcement-Learning
Users who are interested in Awesome-Uncertainty-based-Reinforcement-Learning are comparing it to the repositories listed below
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆92 Updated last month
- [ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆86 Updated 10 months ago
- ☆345 Updated 4 months ago
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆142 Updated 2 months ago
- ☆59 Updated 5 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation" ☆53 Updated last week
- This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆317 Updated last month
- ☆294 Updated 5 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆403 Updated 5 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆341 Updated 3 months ago
- A comprehensive collection of process reward models. ☆130 Updated 2 months ago
- Official Repository of LatentSeek ☆71 Updated 6 months ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 Updated 9 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆228 Updated 2 months ago
- [ICML 2025] Official Implementation of GLIDER ☆72 Updated 2 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆70 Updated 5 months ago
- Resources for the Enigmata Project. ☆74 Updated 4 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆242 Updated this week
- ☆321 Updated 7 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆55 Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆98 Updated 10 months ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆48 Updated this week
- ☆197 Updated this week
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains ☆65 Updated 4 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆57 Updated 3 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆381 Updated last month
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond ☆321 Updated 2 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆49 Updated 6 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆92 Updated last year
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models ☆44 Updated 3 months ago