agentica-project / verl-pipeline
Async pipelined version of Verl
☆125Updated 7 months ago
Alternatives and similar repositories for verl-pipeline
Users who are interested in verl-pipeline are comparing it to the repositories listed below.
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆237Updated 2 months ago
- Implementation of FP8/INT8 rollout for RL training without performance drop.☆269Updated 2 weeks ago
- Reproducing R1 for Code with Reliable Rewards☆271Updated 6 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents☆188Updated 4 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*☆117Updated 11 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.☆248Updated 7 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆114Updated 3 months ago
- A Comprehensive Survey on Long Context Language Modeling☆203Updated 4 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling☆180Updated 3 months ago
- ☆27Updated 2 months ago
- (best/better) practices of megatron on veRL and tuning guide☆102Updated last month
- The HELMET Benchmark☆184Updated 3 months ago
- ☆65Updated 11 months ago
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training.☆206Updated 5 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆52Updated last year
- ☆120Updated 5 months ago
- ☆326Updated 5 months ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings☆55Updated 9 months ago
- Resources for the Enigmata Project.☆73Updated 3 months ago
- ☆212Updated 9 months ago
- ☆86Updated 3 months ago
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models☆132Updated 3 months ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning☆159Updated last week
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆176Updated 6 months ago
- ☆76Updated last year
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆143Updated last year
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆318Updated 2 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆172Updated 2 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning☆257Updated 6 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆116Updated 6 months ago