eric-haibin-lin / verl-community
☆37Updated 3 months ago
Alternatives and similar repositories for verl-community
Users that are interested in verl-community are comparing it to the libraries listed below
- Async pipelined version of Verl☆125Updated 7 months ago
- Implementation for FP8/INT8 Rollout for RL training without performance drop.☆275Updated 3 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆52Updated last year
- Repository of LV-Eval Benchmark☆71Updated last year
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm…☆83Updated 3 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆237Updated 2 months ago
- (best/better) practices of megatron on veRL and tuning guide☆103Updated 2 months ago
- ☆86Updated 3 months ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models☆76Updated last year
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆324Updated 7 months ago
- A Comprehensive Survey on Long Context Language Modeling☆204Updated this week
- Bridge Megatron-Core to Hugging Face/Reinforcement Learning☆164Updated this week
- ☆65Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)☆112Updated 8 months ago
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs☆195Updated last month
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton☆36Updated 9 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆179Updated 3 months ago
- ☆317Updated this week
- Training library for Megatron-based models☆209Updated this week
- ☆120Updated 5 months ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆60Updated 3 months ago
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.☆50Updated last year
- ☆77Updated 8 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.☆248Updated 7 months ago
- [COLM 2025] An Open Math Pre-training Dataset with 370B Tokens.☆107Updated 7 months ago
- ☆207Updated last month
- ☆62Updated 3 weeks ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆79Updated 2 months ago
- Nano repo for RL training of LLMs☆68Updated last month
- REST: Retrieval-Based Speculative Decoding, NAACL 2024☆211Updated 2 months ago