sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
★539 · Updated this week
Alternatives and similar repositories for oat
Users that are interested in oat are comparing it to the libraries listed below
- Reproducible, flexible LLM evaluations ★256 · Updated last week
- ★323 · Updated 4 months ago
- A Gym for Agentic LLMs ★323 · Updated last week
- RewardBench: the first evaluation tool for reward models. ★642 · Updated 4 months ago
- Tina: Tiny Reasoning Models via LoRA ★299 · Updated last month
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ★291 · Updated 2 weeks ago
- A project to improve skills of large language models ★587 · Updated this week
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" ★266 · Updated this week
- A simple unified framework for evaluating LLMs ★251 · Updated 6 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ★1,126 · Updated last month
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ★342 · Updated 10 months ago
- Automatic evals for LLMs ★547 · Updated 3 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training ★283 · Updated last year
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example ★364 · Updated last week
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ★538 · Updated 2 weeks ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ★436 · Updated last year
- SkyRL: A Modular Full-stack RL Library for LLMs ★1,060 · Updated this week
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ★136 · Updated 8 months ago
- Code for the paper: "Learning to Reason without External Rewards" ★364 · Updated 3 months ago
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ★246 · Updated 5 months ago
- PyTorch building blocks for the OLMo ecosystem ★307 · Updated this week
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ★247 · Updated 6 months ago
- A version of verl to support diverse tool use ★607 · Updated this week
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ★551 · Updated 2 months ago
- ★195 · Updated 6 months ago
- ★971 · Updated 3 months ago
- The HELMET Benchmark ★177 · Updated 2 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ★175 · Updated 4 months ago
- Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike stat… ★321 · Updated last week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ★330 · Updated 11 months ago