hiverge / cifar10-speedrun
CIFAR-10 speedrun: Trains to 94% accuracy in 1.98 seconds on a single NVIDIA A100 GPU.
☆43 · Updated 2 months ago
Alternatives and similar repositories for cifar10-speedrun
Users interested in cifar10-speedrun are comparing it to the repositories listed below.
- train with kittens! ☆63 · Updated last year
- SIMD quantization kernels ☆93 · Updated 3 months ago
- Jax-like function transformation engine, but micro: microjax ☆34 · Updated last year
- ☆108 · Updated last week
- JAX implementation of the Mistral 7b v0.2 model ☆35 · Updated last year
- Training code for Sparse Autoencoders on Embedding models ☆39 · Updated 9 months ago
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆105 · Updated 2 months ago
- ☆28 · Updated last year
- Minimal yet performant LLM examples in pure JAX ☆214 · Updated 2 weeks ago
- An introduction to LLM sampling ☆79 · Updated last year
- ☆105 · Updated 4 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆179 · Updated 5 months ago
- You should use PySR to find scaling laws. Here's an example. ☆33 · Updated 2 years ago
- A simple library for scaling up JAX programs ☆144 · Updated last month
- Einsum-like high-level array sharding API for JAX ☆34 · Updated last year
- ☆163 · Updated 3 weeks ago
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP ☆141 · Updated 3 months ago
- Experiment of using Tangent to autodiff Triton ☆81 · Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆120 · Updated 2 months ago
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 · Updated last year
- 🧱 Modula software package ☆315 · Updated 3 months ago
- ☆38 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆196 · Updated last year
- nanoGPT using Equinox ☆14 · Updated 2 years ago
- look how they massacred my boy ☆63 · Updated last year
- NSA Triton kernels written with GPT5 and Opus 4.1 ☆66 · Updated 4 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆108 · Updated 9 months ago
- Simple Transformer in Jax ☆140 · Updated last year
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 · Updated 4 months ago
- Dion optimizer algorithm ☆404 · Updated this week