HomebrewNLP / revlib
Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload
☆125 Updated 2 years ago
Alternatives and similar repositories for revlib:
Users who are interested in revlib are comparing it to the libraries listed below.
- Named tensors with first-class dimensions for PyTorch ☆322 Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆210 Updated last year
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆250 Updated 2 years ago
- Contrastive Language-Image Pretraining ☆142 Updated 2 years ago
- Implementation of Nyström Self-attention, from the paper Nyströmformer ☆124 Updated 11 months ago
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 2 years ago
- Unofficial JAX implementations of deep learning research papers ☆152 Updated 2 years ago
- Implementation of Feedback Transformer in Pytorch ☆105 Updated 3 years ago
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆209 Updated 8 months ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆202 Updated last year
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆106 Updated 2 years ago
- Differentiable Algorithms and Algorithmic Supervision. ☆111 Updated last year
- JMP is a Mixed Precision library for JAX. ☆189 Updated last month
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆47 Updated 11 months ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆84 Updated 3 years ago
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers. ☆101 Updated 3 years ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 Updated 2 years ago
- A GPT, made only of MLPs, in Jax ☆57 Updated 3 years ago
- ☆197 Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆204 Updated last year
- Another attempt at a long-context / efficient transformer by me ☆37 Updated 2 years ago
- [Prototype] Tools for the concurrent manipulation of variably sized Tensors. ☆253 Updated 2 years ago
- ☆153 Updated 4 years ago
- A small demonstration of using WebDataset with ImageNet and PyTorch Lightning ☆74 Updated last year
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆146 Updated last month
- JAX Synergistic Memory Inspector ☆164 Updated 6 months ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆184 Updated 2 years ago
- ☆367 Updated last year
- Pretrained deep learning models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet, etc. ☆240 Updated last year