NVlabs / RLP
RLP: Reinforcement as a Pretraining Objective
☆69 Updated this week
Alternatives and similar repositories for RLP
Users who are interested in RLP are comparing it to the libraries listed below.
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆99 Updated 2 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆112 Updated 2 months ago
- ☆111 Updated this week
- Learn online intrinsic rewards from LLM feedback ☆43 Updated 9 months ago
- Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆40 Updated this week
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆64 Updated 7 months ago
- A simple, performant and scalable JAX-based world modeling codebase ☆76 Updated this week
- Simple repository for training small reasoning models ☆40 Updated 7 months ago
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks ☆35 Updated 11 months ago
- ☆46 Updated last year
- ☆33 Updated 8 months ago
- AIRA-dojo: a framework for developing and evaluating AI research agents ☆95 Updated last week
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆108 Updated last year
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆68 Updated 9 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆84 Updated 11 months ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆55 Updated 2 months ago
- AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each… ☆61 Updated last week
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆151 Updated 2 weeks ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL ☆58 Updated 11 months ago
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC) ☆68 Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆41 Updated last year
- The official implementation of Self-Exploring Language Models (SELM) ☆64 Updated last year
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 Updated last year
- NanoGPT (124M) quality in 2.67B tokens ☆28 Updated 2 weeks ago
- Reinforcing General Reasoning without Verifiers ☆87 Updated 3 months ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆63 Updated 5 months ago
- ☆85 Updated last year
- EvaByte: Efficient Byte-level Language Models at Scale ☆109 Updated 5 months ago
- 📄 Small Batch Size Training for Language Models ☆62 Updated this week
- A testbed for agents and environments that can automatically improve models through data generation. ☆27 Updated 7 months ago