agentica-project / verl-pipeline
Async pipelined version of Verl
☆112 · Updated 4 months ago
Alternatives and similar repositories for verl-pipeline
Users who are interested in verl-pipeline are comparing it to the libraries listed below.
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆218 · Updated 5 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆111 · Updated 7 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆245 · Updated 3 months ago
- Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆139 · Updated 3 weeks ago
- A version of verl to support tool use ☆315 · Updated this week
- Reproducing R1 for Code with Reliable Rewards ☆243 · Updated 3 months ago
- A Comprehensive Survey on Long Context Language Modeling ☆169 · Updated last month
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆100 · Updated 3 weeks ago
- Repository of LV-Eval Benchmark ☆68 · Updated 11 months ago
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆34 · Updated last week
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆162 · Updated this week
- ☆65 · Updated 8 months ago
- ☆309 · Updated 2 months ago
- ☆113 · Updated 2 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆161 · Updated 2 weeks ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆167 · Updated 2 months ago
- Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆343 · Updated 10 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆49 · Updated 9 months ago
- ☆71 · Updated 8 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆127 · Updated 3 months ago
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning ☆74 · Updated this week
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆184 · Updated 10 months ago
- ☆70 · Updated last week
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆307 · Updated 3 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆204 · Updated this week
- ☆71 · Updated 4 months ago
- ☆205 · Updated 5 months ago
- ☆263 · Updated 2 months ago
- Repo of paper "Free Process Rewards without Process Labels" ☆161 · Updated 4 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆133 · Updated last year