sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆604 Updated last week
Alternatives and similar repositories for oat
Users that are interested in oat are comparing it to the libraries listed below
- Reproducible, flexible LLM evaluations ☆312 Updated last month
- A Gym for Agentic LLMs ☆411 Updated last week
- ☆329 Updated 7 months ago
- RewardBench: the first evaluation tool for reward models. ☆674 Updated 6 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,180 Updated 4 months ago
- A project to improve skills of large language models ☆727 Updated this week
- Minimal hackable GRPO implementation ☆308 Updated 11 months ago
- A simple unified framework for evaluating LLMs ☆258 Updated 8 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆365 Updated last year
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆332 Updated 2 months ago
- Code for the paper: "Learning to Reason without External Rewards" ☆385 Updated 5 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆573 Updated 2 months ago
- Tina: Tiny Reasoning Models via LoRA ☆310 Updated 3 months ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆270 Updated 2 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example ☆390 Updated last month
- Automatic evals for LLMs ☆569 Updated last week
- ☆1,050 Updated 6 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆453 Updated last year
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆254 Updated 7 months ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ☆152 Updated 10 months ago
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆341 Updated last month
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆280 Updated 3 months ago
- ☆201 Updated 8 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆183 Updated 7 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆257 Updated 7 months ago
- PyTorch building blocks for the OLMo ecosystem ☆634 Updated this week
- ☆218 Updated 9 months ago
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆407 Updated last month
- Recipes to scale inference-time compute of open models ☆1,122 Updated 7 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆343 Updated last week