galilai-group / stable-pretraining
Reliable, minimal and scalable library for pretraining foundation and world models
☆114 Updated last week
Alternatives and similar repositories for stable-pretraining
Users who are interested in stable-pretraining are comparing it to the libraries listed below.
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons ☆106 Updated 2 months ago
- ☆212 Updated last year
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆340 Updated last month
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆331 Updated 5 months ago
- ☆287 Updated last year
- Research Project Template Repository ☆38 Updated 3 weeks ago
- Library for Jacobian descent with PyTorch. It enables the optimization of neural networks with multiple losses (e.g. multi-task learning)… ☆291 Updated last week
- 🧱 Modula software package ☆322 Updated 4 months ago
- Library that provides metrics to assess representation quality ☆20 Updated 11 months ago
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆112 Updated last month
- ☆313 Updated last year
- ☆234 Updated last year
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆105 Updated last year
- Parameter-Free Optimizers for PyTorch ☆130 Updated last year
- ☆69 Updated 2 years ago
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and… ☆42 Updated 3 weeks ago
- ☆122 Updated 6 months ago
- A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/C… ☆113 Updated 7 months ago
- A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its vari… ☆125 Updated 2 months ago
- IVON optimizer for neural networks based on variational learning. ☆80 Updated last year
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆188 Updated 2 weeks ago
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence). ☆128 Updated last year
- Modern Fixed Point Systems using PyTorch ☆125 Updated 2 years ago
- Implementation of Diffusion Transformer (DiT) in JAX ☆300 Updated last year
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆37 Updated 3 years ago
- Code and weights for the paper "Cluster and Predict Latent Patches for Improved Masked Image Modeling" ☆125 Updated 3 weeks ago
- NF-Layers for constructing neural functionals. ☆93 Updated 2 years ago
- Implementation of https://srush.github.io/annotated-s4 ☆510 Updated 6 months ago
- ☆27 Updated 3 months ago
- Efficient optimizers ☆280 Updated 2 weeks ago