ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆291 Updated this week
Alternatives and similar repositories for PipelineRL
Users who are interested in PipelineRL are comparing it to the libraries listed below.
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆64 Updated 6 months ago
- ☆225 Updated 3 weeks ago
- Async pipelined version of Verl ☆125 Updated 7 months ago
- A Gym for Agentic LLMs ☆352 Updated this week
- PyTorch building blocks for the OLMo ecosystem ☆317 Updated this week
- ☆106 Updated 3 weeks ago
- Reproducible, flexible LLM evaluations ☆264 Updated 2 weeks ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆189 Updated 8 months ago
- Physics of Language Models, Part 4 ☆255 Updated 3 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆564 Updated 2 weeks ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆185 Updated 4 months ago
- Understand and test language model architectures on synthetic tasks. ☆237 Updated last month
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆241 Updated this week
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆179 Updated 5 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆78 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆171 Updated 4 months ago
- ☆108 Updated last year
- FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference. ☆302 Updated last week
- Implementation for FP8/INT8 Rollout for RL training without performance drop. ☆268 Updated last week
- ☆124 Updated 8 months ago
- rl from zero pretrain, can it be done? yes. ☆280 Updated last month
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆147 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆132 Updated 10 months ago
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆88 Updated this week
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆348 Updated this week
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆132 Updated 2 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 Updated 9 months ago
- Replicating O1 inference-time scaling laws ☆90 Updated 11 months ago
- Storing long contexts in tiny caches with self-study ☆213 Updated 3 weeks ago
- Triton-based implementation of Sparse Mixture of Experts. ☆248 Updated last month