galilai-group / stable-pretraining
Reliable, minimal and scalable library for pretraining foundation and world models
☆118 Updated last week
Alternatives and similar repositories for stable-pretraining
Users who are interested in stable-pretraining are comparing it to the libraries listed below.
- ☆215 Updated last year
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆337 Updated 6 months ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons ☆107 Updated 3 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆347 Updated 2 months ago
- ☆238 Updated last year
- ☆289 Updated last year
- 🧱 Modula software package ☆322 Updated 5 months ago
- ☆123 Updated 7 months ago
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆105 Updated last year
- Library that provides metrics to assess representation quality ☆20 Updated 11 months ago
- Deep Learning, an Energy Approach ☆239 Updated 7 months ago
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆191 Updated 2 weeks ago
- Efficient optimizers ☆280 Updated last month
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆107 Updated 2 months ago
- Modern Fixed Point Systems using PyTorch ☆125 Updated 2 years ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆305 Updated last year
- Code and weights for the paper "Cluster and Predict Latent Patches for Improved Masked Image Modeling" ☆129 Updated 2 weeks ago
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence). ☆130 Updated last year
- ☆69 Updated 2 years ago
- Research Project Template Repository ☆40 Updated 2 weeks ago
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆117 Updated last month
- ☆27 Updated 4 months ago
- A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/C… ☆114 Updated 7 months ago
- Universal Notation for Tensor Operations in Python. ☆463 Updated 9 months ago
- Accelerated First Order Parallel Associative Scan ☆196 Updated 3 weeks ago
- Parameter-Free Optimizers for PyTorch ☆130 Updated last year
- ☆62 Updated last year
- Scalable and Stable Parallelization of Nonlinear RNNs ☆28 Updated 3 months ago
- The boundary of neural network trainability is fractal ☆221 Updated last year
- WIP ☆93 Updated last year