Shenzhi-Wang/Beyond-the-80-20-Rule-RLVR

The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."

☆61

Alternatives and similar repositories for Beyond-the-80-20-Rule-RLVR

Users who are interested in Beyond-the-80-20-Rule-RLVR are comparing it to the repositories listed below.

LeapLabTHU / UniTTA
☆21 · Mar 5, 2025 · Updated last year
SHI-Labs / IMG-Multimodal-Diffusion-Alignment
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025
☆30 · Oct 1, 2025 · Updated 9 months ago
LeapLabTHU / InsightTok
InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation
☆38 · May 15, 2026 · Updated 2 months ago
LeapLabTHU / RvR
🔥 Regeneration over editing: unlocking more effective image refinement!
☆52 · May 26, 2026 · Updated 2 months ago
LeapLabTHU / WeightFormer
☆20 · Jun 6, 2026 · Updated last month
LeapLabTHU / diver-ct
☆14 · Dec 19, 2024 · Updated last year
LR32768 / DL_theory_exp
☆16 · Apr 12, 2024 · Updated 2 years ago
LeapLabTHU / AdaNAT
[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
☆37 · Sep 12, 2024 · Updated last year
star9988rr / VIPScene
☆37 · Dec 2, 2025 · Updated 7 months ago
yueyang130 / SEEM
Official code of paper Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL
☆24 · Oct 30, 2023 · Updated 2 years ago
LeapLabTHU / SimPro
[ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
☆31 · Sep 30, 2024 · Updated last year
LeapLabTHU / Dynamic_Perceiver
Official implementation of Dynamic Perceiver
☆44 · Nov 16, 2023 · Updated 2 years ago
LeapLabTHU / AdaptiveNN-Jittor
☆33 · May 27, 2026 · Updated 2 months ago
LeapLabTHU / CODA
CODA: Repurposing Continuous VAEs for Discrete Tokenization
☆37 · Jul 4, 2025 · Updated last year
LeapLabTHU / ENAT
[NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
☆25 · Nov 28, 2024 · Updated last year
LeapLabTHU / DAT-Jittor
Jittor implementation of Vision Transformer with Deformable Attention
☆32 · Mar 1, 2022 · Updated 4 years ago
LeapLabTHU / LAUDNet
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
☆53 · Mar 20, 2025 · Updated last year
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆346 · Dec 17, 2025 · Updated 7 months ago
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago
LeapLabTHU / FamO2O
Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)
☆41 · Oct 30, 2023 · Updated 2 years ago
LeapLabTHU / AdaptiveNN
[Nature Machine Intelligence 2025] Emulating Human-like Adaptive Vision for Efficient and Flexible Machine Visual Perception
☆154 · May 27, 2026 · Updated 2 months ago
LeapLabTHU / JustGRPO
[ICML 2026 Outstanding Paper] Minimalist RL for Diffusion LLMs. 89.1% on GSM8K.
☆252 · Jul 6, 2026 · Updated 3 weeks ago
shihao1895 / SpatialActor
[AAAI 2026 Oral] SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation
☆62 · Jun 13, 2026 · Updated last month
LeapLabTHU / Uni-AdaFocus
Official repository of Uni-AdaFocus (TPAMI 2024).
☆59 · Dec 17, 2024 · Updated last year
LeapLabTHU / ProCo
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
☆95 · Sep 30, 2024 · Updated last year
Andrewzh112 / AI-Research-Interview-Lab
☆31 · Nov 14, 2025 · Updated 8 months ago
Andrewzh112 / ExpeL
☆14 · Dec 16, 2023 · Updated 2 years ago
LeapLabTHU / MOSS
Official implementation of A Mixture of Surprises for Unsupervised Reinforcement Learning
☆23 · Nov 16, 2022 · Updated 3 years ago
LeapLabTHU / AdaFocusV2
[CVPR 2022] Official repository of AdaFocusV2.
☆91 · Dec 15, 2024 · Updated last year
wizard-III / Archer2.0
Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…
☆31 · Oct 10, 2025 · Updated 9 months ago
beanie00 / self-distillation-analysis
Codebase for the work “Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?”
☆75 · Apr 14, 2026 · Updated 3 months ago
LeapLabTHU / Attention-Mediators
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
☆47 · Sep 11, 2024 · Updated last year
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 11 months ago
THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆99 · Jun 16, 2025 · Updated last year
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆446 · Jul 11, 2025 · Updated last year
LeapLabTHU / Deep-Incubation
Code release for Deep Incubation (https://arxiv.org/abs/2212.04129)
☆92 · Mar 16, 2023 · Updated 3 years ago
GBATZOLIS / BitstreamDiffusion
☆16 · Jul 22, 2026 · Updated last week
LeapLabTHU / AdaAFforPINNs
☆19 · Aug 9, 2023 · Updated 2 years ago
lili-chen / rltf
Reinforcement Learning from Text Feedback
☆48 · Feb 17, 2026 · Updated 5 months ago