YujunZhou/EVOL-RL

Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).

☆51

Alternatives and similar repositories for EVOL-RL

Users who are interested in EVOL-RL are comparing it to the repositories listed below.

Chengsong-Huang / G-Zero
☆25 · May 14, 2026 · Updated 2 months ago
zhengkid / Parallel-Probe
The official repo for "Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing"
☆19 · Feb 3, 2026 · Updated 5 months ago
Chengsong-Huang / RelayLLM
☆40 · Jan 10, 2026 · Updated 6 months ago
zhengkid / Parallel-R1
The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"
☆260 · Feb 4, 2026 · Updated 5 months ago
zhengkid / Parallel_Thinking_via_MoT
Official Code for "Learning to Reason via Mixture-of-Thought for Logical Reasoning"
☆29 · Nov 20, 2025 · Updated 8 months ago
MasterVito / SwS
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42 · Nov 11, 2025 · Updated 8 months ago
tmlr-group / Co-rewarding
[ICLR 2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"
☆58 · Feb 4, 2026 · Updated 5 months ago
Leey21 / A-Data-Centric-Study
☆18 · Mar 2, 2026 · Updated 4 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆229 · Nov 27, 2025 · Updated 8 months ago
ritzz-ai / PACS
☆31 · Sep 12, 2025 · Updated 10 months ago
kaiwenzha / RL-Tango
[NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
☆57 · Oct 23, 2025 · Updated 9 months ago
sastpg / CoVo
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
☆25 · Jun 25, 2025 · Updated last year
LongHorizonReasoning / h1
☆26 · Oct 29, 2025 · Updated 8 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,103 · Apr 15, 2026 · Updated 3 months ago
waltonfuture / MM-UPT
[NeurIPS 2025] First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
☆88 · Oct 29, 2025 · Updated 9 months ago
CSSLab / ThinkTwice
Jointly Optimizing Large Language Models for Reasoning and Self-Refinement
☆15 · Apr 22, 2026 · Updated 3 months ago
bruno686 / VisPlay
[CVPR'26] VisPlay: Self-Evolving Vision-Language Models
☆65 · Feb 25, 2026 · Updated 5 months ago
shiqichen17 / SPA
GitHub repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"
☆36 · Nov 1, 2025 · Updated 8 months ago
JingMog / THOR
[ICLR 2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33 · Feb 26, 2026 · Updated 5 months ago
Chengsong-Huang / R-Zero
[ICLR 2026] Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
☆825 · Feb 4, 2026 · Updated 5 months ago
weiyifan1023 / AutoTIR
Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"
☆54 · Sep 4, 2025 · Updated 10 months ago
leroy9472 / InMind
☆15 · Nov 18, 2025 · Updated 8 months ago
tianyi-lab / RoMA
Code for "Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs"
☆19 · Nov 6, 2025 · Updated 8 months ago
zjunlp / KnowRL
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48 · May 19, 2026 · Updated 2 months ago
MasterVito / SvS
Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
☆54 · Dec 13, 2025 · Updated 7 months ago
hkproj / retrieval-augmented-generation-notes
Slides for "Retrieval Augmented Generation" video
☆27 · Nov 27, 2023 · Updated 2 years ago
BaohaoLiao / frac-cot
[COLM 2026] An efficient 3D sampling method for long-CoT LLM.
☆16 · May 25, 2025 · Updated last year
zhangxy-2019 / critique-GRPO
[ICML 2026 Spotlight] Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
☆70 · Jun 3, 2026 · Updated last month
kosonocky / CheF
☆14 · Apr 16, 2024 · Updated 2 years ago
NJUNLP / AdaR
☆15 · Dec 8, 2025 · Updated 7 months ago
MingLiiii / ThinkARM
Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models
☆27 · Dec 21, 2025 · Updated 7 months ago
tianyi-lab / C3PO
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
☆21 · Apr 9, 2025 · Updated last year
QingyangZhang / Label-Free-RLVR
☆311 · Jul 6, 2025 · Updated last year
LiaoMengqi / E3-RL4LLMs
[EMNLP 2025 Main] Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
☆17 · Nov 7, 2025 · Updated 8 months ago
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆461 · Mar 20, 2026 · Updated 4 months ago
facebookresearch / darling
Official Implementation of the paper "Jointly Reinforcing Diversity and Quality in Language Model Generations"
☆61 · May 8, 2026 · Updated 2 months ago
wantbook-book / SeRL
SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
☆24 · Jan 24, 2026 · Updated 6 months ago
SongW-SW / CEB
☆15 · Jun 25, 2025 · Updated last year
uq-project / UQ
UQ: Assessing Language Models on Unsolved Questions
☆30 · Aug 26, 2025 · Updated 11 months ago