SakanaAI / Sudoku-Bench
An AI benchmark for creative, human-like problem solving using Sudoku variants
☆39 Updated last week
Alternatives and similar repositories for Sudoku-Bench:
Users who are interested in Sudoku-Bench are comparing it to the repositories listed below:
- Official repository of the paper, PokeChamp: an Expert-level Minimax Language Agent for Competitive Pokemon. ☆52 Updated 3 weeks ago
- CycleQD is a framework for parameter space model merging. ☆38 Updated 2 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆62 Updated 10 months ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆40 Updated 2 months ago
- Checkpointable dataset utilities for foundation model training ☆32 Updated last year
- ☆52 Updated 6 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆87 Updated last month
- ☆79 Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆17 Updated last month
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning" ☆47 Updated 2 months ago
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆47 Updated last month
- ☆77 Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆104 Updated 5 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆21 Updated last week
- Official repo of paper LM2 ☆37 Updated 2 months ago
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models ☆22 Updated 3 weeks ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 Updated last year
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆71 Updated 5 months ago
- ☆19 Updated last month
- Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" ☆101 Updated 2 months ago
- Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation ☆58 Updated 3 weeks ago
- Triton Implementation of HyperAttention Algorithm ☆47 Updated last year
- Bootstrapping ARC ☆110 Updated 5 months ago
- ☆139 Updated last week
- ☆53 Updated last year
- Code for ☆27 Updated 4 months ago
- ☆27 Updated last month
- Skill Design From AI Feedback ☆27 Updated last month
- Mamba training library developed by kotoba technologies ☆69 Updated last year
- A repository for research on medium sized language models. ☆76 Updated 10 months ago