sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆582 Updated last month
Alternatives and similar repositories for oat
Users that are interested in oat are comparing it to the libraries listed below
- Reproducible, flexible LLM evaluations ☆305 Updated last month
- ☆329 Updated 6 months ago
- Code for the paper: "Learning to Reason without External Rewards" ☆383 Updated 5 months ago
- A Gym for Agentic LLMs ☆404 Updated last month
- RewardBench: the first evaluation tool for reward models. ☆670 Updated 6 months ago
- A project to improve skills of large language models ☆665 Updated this week
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ☆270 Updated 2 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,173 Updated 3 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example ☆385 Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆360 Updated last year
- Tina: Tiny Reasoning Models via LoRA ☆310 Updated 2 months ago
- A simple unified framework for evaluating LLMs ☆257 Updated 8 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆452 Updated last year
- ☆1,035 Updated 5 months ago
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ☆397 Updated last month
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆269 Updated last year
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆248 Updated 8 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆328 Updated last month
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆258 Updated 7 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆565 Updated 2 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,394 Updated this week
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆254 Updated 7 months ago
- ☆200 Updated 8 months ago
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,276 Updated last week
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ☆149 Updated 10 months ago
- Automatic evals for LLMs ☆567 Updated 5 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆601 Updated 4 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆336 Updated this week
- PyTorch building blocks for the OLMo ecosystem ☆563 Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆340 Updated last month