limenlp/verl

limenlp / verl

AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

☆56

Alternatives and similar repositories for verl

Users that are interested in verl are comparing it to the libraries listed below.

limenlp / SEA
Official Implementation for the paper "Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base"
☆27 · Sep 2, 2025 · Updated 10 months ago
ServiceNow / sec
☆16 · Jul 10, 2025 · Updated last year
Zanette-Labs / speed-rl
☆18 · Feb 2, 2026 · Updated 5 months ago
tianyi-lab / C3PO
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
☆21 · Apr 9, 2025 · Updated last year
wumingqi / LLM-Math-Evaluation
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
☆21 · Jul 18, 2025 · Updated last year
ASTRAL-Group / data-efficient-llm-rl
☆45 · Jan 16, 2026 · Updated 6 months ago
WilliamZR / ProTrix
Code for ProTrix: Building Models for Planning and Reasoning over Tables with Sentence Context
☆17 · Nov 15, 2024 · Updated last year
MingLiiii / Gradient_Unified
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
☆20 · Jun 17, 2025 · Updated last year
elated-sawyer / WALL-E
Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
☆63 · Dec 3, 2025 · Updated 7 months ago
CodeLLM-Research / CodeJudge-Eval
[COLING25] CodeJudge Eval: Can Large Language Models be Good Judges in Code Understanding?
☆12 · Dec 3, 2024 · Updated last year
zxiangx / LC-R1
Code for paper: Optimizing Length Compression in Large Reasoning Models
☆29 · Oct 20, 2025 · Updated 9 months ago
tianyi-lab / CoSTAR
Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
☆31 · Mar 26, 2025 · Updated last year
shijian2001 / TemplateMatters
A programmatic instruction template generator aiming at enhancing the understanding of the critical role instruction templates play in la…
☆15 · Dec 22, 2024 · Updated last year
tianyi-lab / FaSTAR
[ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
☆33 · May 30, 2026 · Updated last month
likaixin2000 / MMCode
[EMNLP 2024] Multi-modal reasoning problems via code generation.
☆28 · Apr 14, 2026 · Updated 3 months ago
ZhentingWang / DUMP
☆33 · May 9, 2025 · Updated last year
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆23 · Updated this week
liziniu / GEM
Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)
☆58 · May 12, 2025 · Updated last year
tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19 · Mar 10, 2025 · Updated last year
SalesforceAIResearch / CoAct-1
CoAct-1: Computer-using Agents with Coding as Actions
☆27 · Jun 2, 2026 · Updated last month
tianyi-lab / MiP-Overthinking
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆39 · Jun 5, 2025 · Updated last year
RLHFlow / Minimal-RL
☆275 · May 14, 2025 · Updated last year
wuxiyang1996 / AutoHallusion
AutoHallusion Codebase (EMNLP 2024)
☆23 · Dec 6, 2024 · Updated last year
tianyi-lab / MoE-Embedding
[ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
☆92 · Oct 15, 2024 · Updated last year
bigai-nlco / CREAM
[NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
☆22 · Oct 10, 2024 · Updated last year
WenyiWU0111 / CoMEM-Agent
Official repository for paper Auto-scaling Continuous Memory for GUI Agent
☆29 · Feb 2, 2026 · Updated 5 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
measure-infinity / mulan-code
☆43 · Jul 16, 2024 · Updated 2 years ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
waltonfuture / RL-with-Cold-Start
SFT+RL boosts multimodal reasoning
☆47 · Jun 27, 2025 · Updated last year
tianyi-lab / DisCL
[ICCV 2025] Diffusion Curriculum (DisCL)
☆18 · Sep 26, 2025 · Updated 9 months ago
divyakraman / AerialDiffusion
Codebase for the paper Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
☆13 · Oct 3, 2023 · Updated 2 years ago
kai-wen-yang / IDAA
[ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"
☆10 · Jul 24, 2022 · Updated 4 years ago
sail-sg / feedback-conditional-policy
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65 · Jan 5, 2026 · Updated 6 months ago
tianyi-lab / RuleR
[NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling
☆14 · Sep 27, 2025 · Updated 9 months ago
Yibin-Lei / MetaEOL
Implementation for ACL 2024 paper "Meta-Task Prompting Elicits Embeddings from Large Language Models"
☆12 · Jul 25, 2024 · Updated 2 years ago
divelab / E2H-Reasoning
[ICLR' 26] Implementation of "Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning"
☆24 · May 28, 2026 · Updated last month
ernie-research / CD-RLHF
[ACL'25] Official code of curiosity-driven RLHF
☆16 · Jun 22, 2025 · Updated last year
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago