100M tokens. Infinite compute. Lowest val loss wins.
☆310 · Mar 19, 2026 · Updated this week
Alternatives and similar repositories for slowrun
Users that are interested in slowrun are comparing it to the libraries listed below.
- Code for "What really matters in matrix-whitening optimizers?" ☆23 · Oct 31, 2025 · Updated 4 months ago
- ☆25 · Feb 20, 2026 · Updated last month
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes. ☆22 · Feb 19, 2026 · Updated last month
- Leo optimizer, variation of Muon that runs faster ☆58 · Sep 6, 2025 · Updated 6 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated last year
- ☆48 · Mar 13, 2026 · Updated last week
- ☆31 · Nov 30, 2025 · Updated 3 months ago
- Trains small LMs. Designed for training on SimpleStories ☆12 · Sep 15, 2025 · Updated 6 months ago
- ☆10 · Oct 24, 2024 · Updated last year
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size ☆19 · May 19, 2019 · Updated 6 years ago
- ☆36 · Feb 26, 2024 · Updated 2 years ago
- Add ability to interrupt own message ☆14 · Apr 21, 2024 · Updated last year
- ☆256 · Dec 2, 2024 · Updated last year
- Code for minimum-entropy coupling. ☆33 · Jan 6, 2026 · Updated 2 months ago
- ☆48 · Jul 21, 2025 · Updated 8 months ago
- ☆16 · Aug 7, 2024 · Updated last year
- Automatically review Claude Code plans using external AI CLIs ☆52 · Mar 2, 2026 · Updated 3 weeks ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- Timelight: Universal Path Generator ☆23 · Aug 24, 2025 · Updated 6 months ago
- Simple Transformer in Jax ☆143 · Jun 22, 2024 · Updated last year
- Host CIFAR-10.2 Data Set ☆13 · Sep 22, 2021 · Updated 4 years ago
- JAX implementation of VQVAE/VQGAN autoencoders (+FSQ) ☆41 · Jun 6, 2024 · Updated last year
- ☆35 · Jul 5, 2023 · Updated 2 years ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆23 · Mar 4, 2024 · Updated 2 years ago
- extending laughbot project to encoder-based transformer model finetuned on same dataset for humor classification ☆10 · Jan 4, 2023 · Updated 3 years ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆151 · Oct 2, 2025 · Updated 5 months ago
- NanoGPT (124M) in 2 minutes ☆4,848 · Mar 17, 2026 · Updated last week
- ☆57 · Jun 23, 2025 · Updated 9 months ago
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ☆69 · Updated this week
- Computational abilities and efficiency of neural networks ☆56 · Jul 19, 2025 · Updated 8 months ago
- ☆62 · Apr 12, 2025 · Updated 11 months ago
- official code for paper Probing the Decision Boundaries of In-context Learning in Large Language Models. https://arxiv.org/abs/2406.11233… ☆19 · Jul 27, 2025 · Updated 7 months ago
- Orca is a workspace for vibe coding built upon the principles of tracking what the agent changes and only keeping what you want ☆52 · Updated this week
- Basemap.de world vector with a photon geocoder packaged as tauri app for any device ☆12 · Jan 5, 2025 · Updated last year
- a catch-all repo ☆11 · Dec 28, 2023 · Updated 2 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆81 · Aug 30, 2023 · Updated 2 years ago
- CIFAR-10 speedrun: Trains to 94% accuracy in 1.98 seconds on a single NVIDIA A100 GPU. ☆66 · Oct 17, 2025 · Updated 5 months ago
- A neural network from scratch in C++ using MLX ☆19 · Oct 1, 2025 · Updated 5 months ago
- Minimal open-source implementation of AlphaProof and HyperTree Proof Search. ☆72 · Mar 9, 2026 · Updated 2 weeks ago