SakanaAI / Sudoku-Bench
An AI benchmark for creative, human-like problem solving using Sudoku variants
☆43 Updated this week
Alternatives and similar repositories for Sudoku-Bench:
Users who are interested in Sudoku-Bench are comparing it to the repositories listed below:
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆61 Updated 10 months ago
- ☆78 Updated 8 months ago
- CycleQD is a framework for parameter space model merging. ☆39 Updated 3 months ago
- Triton Implementation of HyperAttention Algorithm ☆48 Updated last year
- Official repository of the paper, PokeChamp: an Expert-level Minimax Language Agent for Competitive Pokemon. ☆57 Updated last month
- EvaByte: Efficient Byte-level Language Models at Scale ☆92 Updated 2 weeks ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆41 Updated 3 months ago
- ☆54 Updated 8 months ago
- ☆52 Updated 11 months ago
- Bootstrapping ARC ☆115 Updated 5 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆73 Updated 5 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆114 Updated this week
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆49 Updated 2 weeks ago
- ☆146 Updated last month
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆72 Updated 6 months ago
- Minimal but scalable implementation of large language models in JAX ☆34 Updated 6 months ago
- ☆130 Updated last month
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆72 Updated 8 months ago
- Experiments for efforts to train a new and improved T5 ☆77 Updated last year
- Supporting PyTorch FSDP for optimizers ☆80 Updated 5 months ago
- ☆31 Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 Updated 2 months ago
- Transformer with Mu-Parameterization, implemented in JAX/Flax. Supports FSDP on TPU pods. ☆30 Updated this week
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆39 Updated 6 months ago
- Mixture of A Million Experts ☆44 Updated 9 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆114 Updated 4 months ago
- PyTorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at DeepMind ☆123 Updated 8 months ago
- Checkpointable dataset utilities for foundation model training ☆32 Updated last year
- ☆49 Updated last year
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆80 Updated 3 years ago