100M tokens. Infinite compute. Lowest val loss wins.
☆503 · Jun 15, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for slowrun
Users that are interested in slowrun are comparing it to the libraries listed below.
- Code for "What really matters in matrix-whitening optimizers?" ☆24 · Oct 31, 2025 · Updated 7 months ago
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation. ☆38 · Apr 30, 2026 · Updated last month
- ☆26 · Feb 20, 2026 · Updated 4 months ago
- A power-user-focused interface for LLM base models. ☆72 · Updated this week
- ☆34 · Nov 30, 2025 · Updated 6 months ago
- ☆10 · Oct 24, 2024 · Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated 2 years ago
- ☆61 · Mar 13, 2026 · Updated 3 months ago
- Trains small LMs. Designed for training on SimpleStories ☆14 · Sep 15, 2025 · Updated 9 months ago
- Toy reproduction of Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts ☆31 · Sep 1, 2024 · Updated last year
- The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size ☆19 · May 19, 2019 · Updated 7 years ago
- Efficient scaling laws and collaborative pretraining. ☆22 · Sep 18, 2025 · Updated 9 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆196 · Jan 19, 2026 · Updated 5 months ago
- Benchmarking Optimizers for LLM Pretraining ☆60 · May 3, 2026 · Updated last month
- Add ability to interrupt own message ☆14 · Apr 21, 2024 · Updated 2 years ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆144 · May 6, 2026 · Updated last month
- Code for minimum-entropy coupling. ☆33 · Jan 6, 2026 · Updated 5 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆30 · Feb 6, 2026 · Updated 4 months ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- [NeurIPS 2025, Spotlight]: Ambient-o: Training Good models with Bad Data. ☆35 · Apr 6, 2026 · Updated 2 months ago
- Rose (n-way) trees with both upwards- (i.e. cached) and downwards-traveling (i.e. accumulating) monoidal annotations. ☆16 · Apr 18, 2026 · Updated 2 months ago
- ☆77 · Jun 9, 2026 · Updated 2 weeks ago
- Polynomial semantics of linear logic ☆13 · Apr 15, 2018 · Updated 8 years ago
- Timelight: Universal Path Generator ☆23 · Aug 24, 2025 · Updated 10 months ago
- Simple Transformer in JAX ☆144 · Jun 22, 2024 · Updated 2 years ago
- Host CIFAR-10.2 Data Set ☆13 · Sep 22, 2021 · Updated 4 years ago
- JAX implementation of VQVAE/VQGAN autoencoders (+FSQ) ☆42 · Jun 6, 2024 · Updated 2 years ago
- ☆20 · Jan 27, 2024 · Updated 2 years ago
- ☆28 · Oct 7, 2025 · Updated 8 months ago
- ☆35 · Jul 5, 2023 · Updated 2 years ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆23 · Mar 4, 2024 · Updated 2 years ago
- NanoGPT (124M) in 90 seconds ☆5,438 · Jun 21, 2026 · Updated last week
- Code for the "Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning" paper. ☆18 · Nov 21, 2025 · Updated 7 months ago
- Official Code for What Makes and Breaks Safety Fine-tuning? A Mechanistic Study (NeurIPS 2024) ☆12 · Oct 31, 2024 · Updated last year
- Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023) ☆17 · Sep 3, 2024 · Updated last year
- Flax (JAX) implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation ☆12 · May 24, 2021 · Updated 5 years ago
- Haskell implementation of open games ☆13 · Apr 20, 2016 · Updated 10 years ago
- Supporting PyTorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆154 · Oct 2, 2025 · Updated 8 months ago