esceptico / perceiver-io
Unofficial implementation of Perceiver IO
☆123 · Updated 3 years ago
Alternatives and similar repositories for perceiver-io
Users who are interested in perceiver-io compare it to the libraries listed below.
- Learning to Initialize Neural Networks for Stable and Efficient Training ☆139 · Updated 3 years ago
- My implementation of DeepMind's Perceiver ☆63 · Updated 4 years ago
- EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax ☆128 · Updated last year
- Implementation of Feedback Transformer in Pytorch ☆107 · Updated 4 years ago
- My repo for training neural nets using pytorch-lightning and hydra ☆221 · Updated 4 months ago
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆252 · Updated 2 years ago
- Collection of the latest, greatest, deep learning optimizers (for Pytorch) - CNN, NLP suitable ☆215 · Updated 4 years ago
- Data Reading Blocks for Python ☆105 · Updated 4 years ago
- 🧀 Pytorch code for the Fromage optimiser. ☆124 · Updated 11 months ago
- ☆68 · Updated last year
- Easy-to-use AdaHessian optimizer (PyTorch) ☆79 · Updated 4 years ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆127 · Updated 2 years ago
- An alternative to convolution in neural networks ☆256 · Updated last year
- ☆153 · Updated 5 years ago
- GAN models implemented with Pytorch Lightning and Hydra configuration ☆34 · Updated 3 years ago
- A tiny Catalyst-like experiment runner framework on top of micrograd. ☆51 · Updated 4 years ago
- Official codebase for Pretrained Transformers as Universal Computation Engines. ☆248 · Updated 3 years ago
- A GPT, made only of MLPs, in Jax ☆58 · Updated 4 years ago
- A small demonstration of using WebDataset with ImageNet and PyTorch Lightning ☆74 · Updated last year
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆204 · Updated last year
- Lightweight Hyperparameter Optimization 🚂 ☆147 · Updated 10 months ago
- Trains Transformer model variants. Data isn't shuffled between batches. ☆143 · Updated 2 years ago
- a lightweight transformer library for PyTorch ☆72 · Updated 3 years ago
- Is the attention layer even necessary? (https://arxiv.org/abs/2105.02723) ☆486 · Updated 4 years ago
- Unofficial JAX implementations of deep learning research papers ☆156 · Updated 3 years ago
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions ☆258 · Updated last year
- Official code for the Stochastic Polyak step-size optimizer ☆139 · Updated last year
- graftr: an interactive shell to view and edit PyTorch checkpoints. ☆113 · Updated 4 years ago
- Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization". ☆159 · Updated 4 years ago
- A simple library that implements CLIP guided loss in PyTorch. ☆77 · Updated 3 years ago