Sohl-Dickstein / fractal
The boundary of neural network trainability is fractal
☆204 · Updated last year
Alternatives and similar repositories for fractal
Users interested in fractal are comparing it to the libraries listed below.
- ☆150 · Updated 9 months ago
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. ☆170 · Updated 2 years ago
- Minimal GPT (~350 lines with a simple task to test it) ☆62 · Updated 5 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆84 · Updated 2 months ago
- Compositional Linear Algebra ☆474 · Updated last week
- ☆36 · Updated 5 months ago
- ☆185 · Updated 6 months ago
- 🧱 Modula software package ☆194 · Updated 2 months ago
- Uncertainty quantification with PyTorch ☆357 · Updated last month
- ☆433 · Updated 7 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆139 · Updated 2 weeks ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆239 · Updated 3 months ago
- ☆267 · Updated 10 months ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆100 · Updated 5 months ago
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆175 · Updated this week
- Implementation of Diffusion Transformer (DiT) in JAX ☆276 · Updated 11 months ago
- A package for defining deep learning models using categorical algebraic expressions. ☆60 · Updated 10 months ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons (see the Kuramoto sketch after this list) ☆89 · Updated 2 weeks ago
- Patched Attention for Nonlinear Dynamics ☆132 · Updated last week
- A 1D analogue of the MNIST dataset for measuring spatial biases and answering Science of Deep Learning questions. ☆223 · Updated 7 months ago
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" (see the gradient-filter sketch after this list) ☆554 · Updated 11 months ago
- Graph neural networks in JAX. ☆67 · Updated 11 months ago
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆250 · Updated this week
- Getting crystal-like representations with harmonic loss ☆187 · Updated 2 months ago
- Flow-matching algorithms in JAX (see the flow-matching sketch after this list) ☆92 · Updated 9 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆190 · Updated last year
- An interactive exploration of Transformer programming. ☆264 · Updated last year
- Official implementation of the ICML 2023 paper "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally … ☆72 · Updated 2 years ago
- The AdEMAMix Optimizer: Better, Faster, Older. (see the AdEMAMix sketch after this list) ☆183 · Updated 8 months ago
- Parameter-Free Optimizers for PyTorch ☆129 · Updated last year
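
A few of the methods above are compact enough to sketch. The Artificial Kuramoto Oscillatory Neurons work builds on the classical Kuramoto model of coupled phase oscillators; the snippet below is the textbook dynamics only, not the paper's layer, and is here just to show the update rule the name refers to.

```python
import numpy as np

def kuramoto_step(theta, omega, coupling_k, dt=0.01):
    """One Euler step of the classical Kuramoto model:
        dtheta_i/dt = omega_i + (K / N) * sum_j sin(theta_j - theta_i)
    theta: (N,) oscillator phases; omega: (N,) natural frequencies.
    """
    n = theta.shape[0]
    # Pairwise phase pulls sin(theta_j - theta_i), summed over j.
    pull = np.sin(theta[None, :] - theta[:, None]).sum(axis=1)
    return theta + dt * (omega + (coupling_k / n) * pull)
```

When coupling_k is large relative to the spread of omega, the phases synchronize; the paper builds network units around this synchronization dynamic.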
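Grokfast's titular idea is to low-pass filter the gradient and amplify its slow component. Below is a minimal sketch of an EMA version of that filter; the function name and defaults are illustrative, not the repo's exact API.

```python
import torch

def ema_gradient_filter(model, ema=None, alpha=0.98, lamb=2.0):
    """Grokfast-style gradient filtering (sketch).

    Maintains an EMA of each parameter's gradient (the "slow" component)
    and adds lamb * EMA back onto the raw gradient. Call between
    loss.backward() and optimizer.step(), threading `ema` through steps.
    """
    if ema is None:
        ema = {name: torch.zeros_like(p.grad)
               for name, p in model.named_parameters() if p.grad is not None}
    for name, p in model.named_parameters():
        if p.grad is not None:
            ema[name] = alpha * ema[name] + (1 - alpha) * p.grad.detach()
            p.grad = p.grad + lamb * ema[name]
    return ema
```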
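The flow-matching entry is also easy to state. That repo is in JAX, but to keep one language across these sketches, here is the generic conditional flow-matching loss with a straight-line path in PyTorch (the standard algorithm from the flow-matching literature, not that repo's API).

```python
import torch

def flow_matching_loss(v_theta, x0, x1):
    """Conditional flow matching with a linear interpolation path (sketch).

    x0: base (noise) samples and x1: data samples, both (batch, dim).
    Along x_t = (1 - t) * x0 + t * x1 the path velocity is x1 - x0,
    so the learned vector field v_theta(x_t, t) regresses onto it.
    """
    t = torch.rand(x0.shape[0], 1)   # one time per example in [0, 1)
    x_t = (1 - t) * x0 + t * x1      # point on the interpolation path
    target = x1 - x0                 # d(x_t)/dt, constant along the path
    return ((v_theta(x_t, t) - target) ** 2).mean()
```

At sampling time one integrates dx/dt = v_theta(x, t) from t = 0 to t = 1, starting from noise.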
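AdEMAMix, as its title hints, replaces Adam's single first-moment EMA with a mix of a fast EMA and a much slower one. The sketch below is one reading of the core update; the default constants and the paper's warmup schedules for alpha and beta3 (omitted here) should be treated as assumptions, and none of this is the repo's API.

```python
import torch

def ademamix_step(param, grad, state, lr=1e-3,
                  betas=(0.9, 0.999, 0.9999), alpha=5.0, eps=1e-8):
    """One AdEMAMix-style update (sketch, assumed constants).

    state holds 't' (step count, starting at 0) and tensors 'm1', 'm2',
    'v' initialized to torch.zeros_like(param). m1: fast momentum
    (beta1, bias-corrected), m2: slow momentum (beta3, not
    bias-corrected), v: second moment (beta2).
    """
    b1, b2, b3 = betas
    state['t'] += 1
    state['m1'] = b1 * state['m1'] + (1 - b1) * grad
    state['m2'] = b3 * state['m2'] + (1 - b3) * grad
    state['v'] = b2 * state['v'] + (1 - b2) * grad * grad
    m1_hat = state['m1'] / (1 - b1 ** state['t'])
    v_hat = state['v'] / (1 - b2 ** state['t'])
    # The slow EMA enters the numerator with mixing weight alpha.
    param.data -= lr * (m1_hat + alpha * state['m2']) / (v_hat.sqrt() + eps)
```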