HomebrewML / revlib
Simple and efficient RevNet library for PyTorch, with XLA and DeepSpeed support and parameter offload
☆126 · Updated 2 years ago
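For context, revlib is built around the additive reversible coupling from the RevNet paper (Gomez et al.), which lets layer inputs be recomputed from layer outputs during the backward pass instead of being stored. The plain-PyTorch sketch below shows the coupling idea itself, not revlib's actual API; revlib additionally wires this into custom autograd plus the XLA/DeepSpeed and offload machinery named in the description:

```python
import torch
from torch import nn

class ReversibleBlock(nn.Module):
    """Additive coupling in the RevNet style: inputs are recoverable
    from outputs, so activations need not be kept for backprop."""

    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1: torch.Tensor, x2: torch.Tensor):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1: torch.Tensor, y2: torch.Tensor):
        # Reconstruct the inputs from the outputs
        # (mathematically exact, up to floating-point rounding).
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

# Quick check that the coupling really inverts.
block = ReversibleBlock(nn.Linear(16, 16), nn.Linear(16, 16))
a, b = torch.randn(4, 16), torch.randn(4, 16)
y1, y2 = block(a, b)
x1, x2 = block.inverse(y1, y2)
assert torch.allclose(x1, a, atol=1e-6) and torch.allclose(x2, b, atol=1e-6)
```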
Alternatives and similar repositories for revlib:
Users interested in revlib are comparing it to the libraries listed below.
- Named tensors with first-class dimensions for PyTorch ☆321 · Updated last year
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆209 · Updated 9 months ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆204 · Updated last year
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆251 · Updated 2 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention (see the sketch after this list) ☆210 · Updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 2 years ago
- Contrastive Language-Image Pretraining ☆142 · Updated 2 years ago
- Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper ☆80 · Updated 3 years ago
- Differentiable Algorithms and Algorithmic Supervision. ☆112 · Updated last year
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆107 · Updated 2 years ago
- Implementation of Feedback Transformer in Pytorch ☆105 · Updated 3 years ago
- HomebrewNLP in JAX flavour for maintainable TPU training ☆48 · Updated last year
- Train ImageNet *fast* in 500 lines of code with FFCV ☆139 · Updated 9 months ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆168 · Updated 2 months ago
- ☆199 · Updated 2 years ago
- Pretrained deep learning models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet, etc. ☆246 · Updated last year
- ☆98 · Updated 3 years ago
- Unofficial JAX implementations of deep learning research papers ☆153 · Updated 2 years ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆84 · Updated 3 years ago
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers. ☆102 · Updated 3 years ago
- DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight … ☆235 · Updated last year
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 · Updated 3 years ago
- 🧀 Pytorch code for the Fromage optimiser. ☆123 · Updated 7 months ago
- A Pytree Module system for Deep Learning in JAX ☆213 · Updated last year
- JMP is a Mixed Precision library for JAX. ☆191 · Updated 2 weeks ago
- ☆67 · Updated last year
- Easy-to-use AdaHessian optimizer (PyTorch) ☆77 · Updated 4 years ago
- LoRA for arbitrary JAX models and functions ☆135 · Updated 11 months ago
- Easy Hypernetworks in Pytorch and Jax ☆97 · Updated 2 years ago
- Another attempt at a long-context / efficient transformer by me ☆37 · Updated 2 years ago
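On the cosine-similarity attention entry above: independent of any fused kernel, the core idea is to l2-normalize queries and keys so the attention logits are bounded cosine similarities, with a fixed or learnable temperature replacing the usual 1/sqrt(d) scaling. A minimal unfused PyTorch sketch, where the `scale` value is illustrative rather than taken from that repository:

```python
import torch
import torch.nn.functional as F

def cosine_sim_attention(q, k, v, scale: float = 16.0):
    """Unfused sketch of cosine-similarity attention.

    q, k, v: (batch, heads, seq, dim). `scale` is an illustrative
    temperature; implementations often make it learnable per head.
    """
    q = F.normalize(q, dim=-1)  # logits become cosine similarities in [-1, 1]
    k = F.normalize(k, dim=-1)
    sim = torch.einsum('bhid,bhjd->bhij', q, k) * scale
    attn = sim.softmax(dim=-1)
    return torch.einsum('bhij,bhjd->bhid', attn, v)

q = k = v = torch.randn(2, 4, 128, 64)
out = cosine_sim_attention(q, k, v)  # -> (2, 4, 128, 64)
```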