falonss703/Awesome-Uncertainty-based-Reinforcement-Learning



🔥🔥🔥Latest Papers, Codes on Uncertainty-based RL

☆58

Alternatives and similar repositories for Awesome-Uncertainty-based-Reinforcement-Learning

Users interested in Awesome-Uncertainty-based-Reinforcement-Learning are comparing it to the repositories listed below.


hhhaaahhhaa / ASR-TTA
☆16 · Nov 4, 2025 · Updated 8 months ago
AI45Lab / DEAN
☆11 · Oct 25, 2024 · Updated last year
QingyangZhang / Label-Free-RLVR
☆311 · Jul 6, 2025 · Updated last year
Kwai-Klear / CE-GPPO
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
☆16 · Jan 23, 2026 · Updated 6 months ago
ChnQ / MI-Peaks
☆68 · Jul 14, 2025 · Updated last year
RedSearchAgent / DeepTraceHub
RedSearcher's framework for deep search agent trajectory synthesis, QA filtering, and model evaluation, supporting ReACT and DeepSeek-sty…
☆23 · Feb 26, 2026 · Updated 5 months ago
QingyangZhang / TEMPO
Scaling Test-time Training for LLM Reasoning
☆27 · Apr 14, 2026 · Updated 3 months ago
DripNowhy / Sherlock
[NeurIPS 2025] Official Implementation of paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models"
☆31 · Jun 4, 2026 · Updated last month
damanimehul / RLCR
Official repository for Beyond Binary Rewards: Training LMs to Reason about Their Uncertainty
☆68 · Aug 20, 2025 · Updated 11 months ago
NJU-LINK / WebCompass
The Source Code for WebCompass
☆21 · May 2, 2026 · Updated 2 months ago
prnake / kimi-deepresearch
Kimi K2 Thinking Agentic Search Unofficial Implementation
☆15 · Nov 9, 2025 · Updated 8 months ago
OpenIXCLab / CODA
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
☆37 · Aug 28, 2025 · Updated 11 months ago
FloyedShen / AntiSD
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
☆33 · May 14, 2026 · Updated 2 months ago
Claw-Eval-Live / Claw-Eval-Live
☆43 · Jun 17, 2026 · Updated last month
RUCBM / LaSeR
[ICLR 2026] Official repository for the paper "LaSeR: Reinforcement Learning with Last-Token Self-Rewarding"
☆36 · Oct 28, 2025 · Updated 9 months ago
yayayacc / MUR
☆49 · May 14, 2026 · Updated 2 months ago
Osilly / Awesome-Interleaving-Reasoning
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆281 · Jun 5, 2026 · Updated last month
shivamag125 / EM_PT
☆33 · Aug 21, 2025 · Updated 11 months ago
longmalongma / TW-GRPO
The official repository of our paper "Reinforcing Video Reasoning with Focused Thinking"
☆36 · Jun 12, 2025 · Updated last year
QingyangZhang / EMPO
[NeurIPS25 Spotlight] EMPO, A Fully Unsupervised RLVR Method
☆103 · Nov 24, 2025 · Updated 8 months ago
suu990901 / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82 · Dec 25, 2025 · Updated 7 months ago
technion-cs-nlp / hallucination-mitigation
☆23 · Dec 17, 2024 · Updated last year
EffiVLM-Bench / EffiVLM-Bench
☆35 · Jun 3, 2025 · Updated last year
WisdomShell / RewardAnything
RewardAnything: Generalizable Principle-Following Reward Models
☆44 · Jun 11, 2025 · Updated last year
BlueWhaleLab / DCScore
☆13 · May 23, 2025 · Updated last year
voidful / MMLM
Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra
☆16 · Dec 10, 2024 · Updated last year
HKUST-KnowComp / NAACL
The official codebase for our paper "NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems"
☆24 · Feb 28, 2026 · Updated 5 months ago
MeiGen-AI / PosterReward
[CVPR2026] PosterReward: Unlocking Accurate Evaluation for High-Quality Graphic Design Generation
☆32 · Apr 2, 2026 · Updated 3 months ago
Coobiw / IE-Critic-R1
IE-Critic-R1: Advancing the Explanatory Measurement of Text-Driven Image Editing for Human Perception Alignment
☆19 · Nov 26, 2025 · Updated 8 months ago
HHYHRHY / OWMM-Agent
[NeurIPS'2025] "OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis"
☆30 · Dec 4, 2025 · Updated 7 months ago
DocTron-hub / VinciCoder
☆42 · Jan 9, 2026 · Updated 6 months ago
ritaranx / AceSearcher
This is the code repo for the paper AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play (NeurIPS 2025 Spotl…
☆25 · Sep 29, 2025 · Updated 10 months ago
InternLM / Visual-ERM
Official Implementation of "Visual-ERM: Reward Modeling for Visual Equivalence"
☆64 · Mar 23, 2026 · Updated 4 months ago
mt-cly / ViP3DEdit
[AAAI26] ViP3DE: Fast Multi-view Consistent 3D Editing with Video Priors
☆22 · Mar 5, 2026 · Updated 4 months ago
L-O-I / RRVF
☆18 · Aug 7, 2025 · Updated 11 months ago
cor3bit / bertsekas-marl
PyTorch Implementation of the Sequential Multiagent Rollout algorithm
☆11 · Jun 28, 2024 · Updated 2 years ago
ChnQ / LLM4Mol
Code implementation for paper "Can Large Language Models Empower Molecular Property Prediction?"
☆39 · Jul 14, 2023 · Updated 3 years ago
joey-wang123 / CL-refresh-learning
A Unified and General Framework for Continual Learning, ICLR 2024
☆15 · Mar 22, 2024 · Updated 2 years ago
QingyangZhang / awesome-low-quality-multimodal-learning
☆54 · Dec 30, 2024 · Updated last year