erikwijmans / skynet-ddp-slurm-example
Example of using PyTorch DistributedDataParallel and SLURM on skynet
☆30 · Updated 4 years ago
Alternatives and similar repositories for skynet-ddp-slurm-example
Users who are interested in skynet-ddp-slurm-example are comparing it to the repositories listed below
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆120 · Updated 4 years ago
- ☆62 · Updated 5 years ago
- ☆165 · Updated 7 years ago
- Labels and other data for the paper "Are we done with ImageNet?" ☆199 · Updated 4 years ago
- ☆43 · Updated 6 years ago
- Code for the ICML'20 paper "Improving Transformer Optimization Through Better Initialization" ☆89 · Updated 4 years ago
- Implementation of https://arxiv.org/abs/1904.00962 ☆377 · Updated 5 years ago
- A Re-implementation of Fixed-update Initialization ☆155 · Updated 6 years ago
- Implementation of Sparsemax activation in Pytorch ☆166 · Updated 5 years ago
- Big-Interleaved-Dataset ☆58 · Updated 2 years ago
- "Layer-wise Adaptive Rate Scaling" in PyTorch ☆87 · Updated 4 years ago
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆62 · Updated last year
- Profile the GPU memory usage of every line in a Pytorch code ☆83 · Updated 7 years ago
- ☆16 · Updated 3 years ago
- Code for Multi-Head Attention: Collaborate Instead of Concatenate ☆153 · Updated 2 years ago
- A PyTorch converter for SimCLR checkpoints ☆108 · Updated 4 years ago
- A minimal pytorch package implementing a gradient reversal layer. ☆158 · Updated last year
- An implementation of shampoo ☆78 · Updated 7 years ago
- Code for SelfAugment ☆27 · Updated 5 years ago
- Parameter Efficient Transfer Learning with Diff Pruning ☆74 · Updated 4 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆80 · Updated 3 years ago
- Implementation of the reversible residual network in pytorch ☆106 · Updated 3 years ago
- ☆47 · Updated 4 years ago
- Code for "Understanding and Improving Layer Normalization" ☆46 · Updated 6 years ago
- CLASP - Contrastive Language-Aminoacid Sequence Pretraining ☆143 · Updated 4 years ago
- Analyzing basic network responses to novel classes ☆41 · Updated 4 years ago
- A small demonstration of using WebDataset with ImageNet and PyTorch Lightning ☆75 · Updated 2 years ago
- Fully featured implementation of Routing Transformer ☆299 · Updated 4 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆228 · Updated 3 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆67 · Updated 3 years ago