SakanaAI / ALE-Bench
The official repository of ALE-Bench
☆78 Updated last week
Alternatives and similar repositories for ALE-Bench
Users who are interested in ALE-Bench are comparing it to the libraries listed below.
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆122 Updated this week
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆63 Updated last year
- An AI benchmark for creative, human-like problem solving using Sudoku variants ☆70 Updated last month
- CycleQD is a framework for parameter space model merging. ☆40 Updated 4 months ago
- ☆22 Updated last month
- ☆115 Updated 4 months ago
- Esoteric Language Models ☆77 Updated last week
- ☆47 Updated last month
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆42 Updated this week
- ☆51 Updated 7 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 Updated 2 months ago
- ☆74 Updated 2 months ago
- Measuring General Intelligence With Generated Games (Preprint) ☆24 Updated 3 weeks ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 Updated 10 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆24 Updated 3 months ago
- Official Code Release for "Training a Generally Curious Agent" ☆25 Updated last month
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆18 Updated last month
- A benchmark for evaluating situated inductive reasoning ☆16 Updated 5 months ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆44 Updated 4 months ago
- ☆79 Updated 10 months ago
- A repository for research on medium sized language models. ☆76 Updated last year
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Plug-and-play PyTorch implementation of the paper "Evolutionary Optimization of Model Merging Recipes" by Sakana AI ☆30 Updated 7 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆93 Updated 2 weeks ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆68 Updated 3 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆102 Updated 2 months ago
- Official repo for Learning to Reason for Long-Form Story Generation ☆63 Updated 2 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆58 Updated 4 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆39 Updated 7 months ago