agentica-project / verl-pipeline

Async pipelined version of Verl

☆123

Alternatives and similar repositories for verl-pipeline

Users who are interested in verl-pipeline are comparing it to the libraries listed below.

princeton-nlp / ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
☆236 Updated last month
yaof20 / Flash-RL
Implementation of FP8/INT8 rollout for RL training without performance drop.
☆261 Updated last month
R2E-Gym / R2E-Gym
[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆178 Updated 3 months ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆114 Updated 2 months ago
ISEEKYAN / verl_megatron_practice
(Best/better) practices for Megatron on veRL, and a tuning guide
☆98 Updated last month
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆115 Updated 10 months ago
thu-wyz / inference_scaling
☆75 Updated 11 months ago
ganler / code-r1
Reproducing R1 for Code with Reliable Rewards
☆262 Updated 5 months ago
ISEEKYAN / mbridge
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
☆142 Updated last week
hkust-nlp / llm-compression-intelligence
Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆142 Updated last year
Glaciohound / LM-Infinite
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆149 Updated 7 months ago
eddycmu / demystify-long-cot
☆322 Updated 5 months ago
LCLM-Horizon / A-Comprehensive-Survey-For-Long-Context-Language-Modeling
A Comprehensive Survey on Long Context Language Modeling
☆197 Updated 3 months ago
Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆52 Updated last year
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆132 Updated 2 months ago
infinigence / LVEval
Repository of LV-Eval Benchmark
☆70 Updated last year
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆174 Updated 5 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆179 Updated 3 months ago
SkyworkAI / skywork-o1-prm-inference
☆65 Updated 11 months ago
ltzheng / SimpleTIR
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆308 Updated last month
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆164 Updated 7 months ago
sail-sg / oat-zero
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
☆247 Updated 6 months ago
GAIR-NLP / AIME-Preview
☆75 Updated 7 months ago
princeton-nlp / HELMET
The HELMET Benchmark
☆178 Updated 2 months ago
yyht / openrlhf_async_pipline
☆83 Updated 2 months ago
openpsi-project / ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
☆322 Updated 6 months ago
OpenBMB / InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
☆353 Updated last year
OpenSparseLLMs / Linear-MoE
☆120 Updated 4 months ago
thunlp / Ouroboros
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆111 Updated 7 months ago
richardodliu / OpenCodeEval
☆47 Updated 2 months ago