Sohl-Dickstein / fractal
The boundary of neural network trainability is fractal
☆194 Updated last year
Alternatives and similar repositories for fractal:
Users interested in fractal are comparing it to the repositories listed below.
- ☆149 Updated 6 months ago
- 🧱 Modula software package ☆139 Updated this week
- ☆157 Updated 2 months ago
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. ☆164 Updated last year
- ☆36 Updated 2 months ago
- ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs). ☆200 Updated this week
- 94% on CIFAR-10 in 2.6 seconds 💨 96% in 27 seconds ☆205 Updated this week
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆540 Updated 7 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆48 Updated 2 months ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆264 Updated 8 months ago
- Visualizations of the theory behind diffusion models. ☆77 Updated 9 months ago
- ☆78 Updated 10 months ago
- Minimal GPT (~350 lines with a simple task to test it) ☆62 Updated 2 months ago
- Implementation of PSGD optimizer in JAX ☆28 Updated last month
- A package for defining deep learning models using categorical algebraic expressions. ☆59 Updated 6 months ago
- Flow-matching algorithms in JAX ☆83 Updated 6 months ago
- The history files when recording human interaction while solving ARC tasks ☆97 Updated this week
- ☆207 Updated 7 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆77 Updated 8 months ago
- Official Implementation of the ICML 2023 paper: "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally … ☆69 Updated last year
- supporting pytorch FSDP for optimizers ☆76 Updated 2 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆61 Updated 6 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆95 Updated last month
- The AdEMAMix Optimizer: Better, Faster, Older. ☆177 Updated 5 months ago
- Resources from the EleutherAI Math Reading Group ☆52 Updated last month
- Bootstrapping ARC ☆100 Updated 2 months ago
- ☆416 Updated 3 months ago
- WIP ☆93 Updated 6 months ago
- Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence). ☆108 Updated 3 months ago