facebookresearch / dropout
Code release for "Dropout Reduces Underfitting"
☆313 · Updated 2 years ago
Alternatives and similar repositories for dropout:
Users who are interested in dropout are comparing it to the repositories listed below.
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆207 · Updated last year
- ☆301 · Updated 10 months ago
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time ☆462 · Updated 9 months ago
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in PyTorch ☆251 · Updated 2 years ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆213 · Updated 2 years ago
- Official PyTorch Implementation of "Learning to Learn with Generative Models of Neural Network Checkpoints" ☆340 · Updated 2 years ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch ☆286 · Updated last month
- ☆201 · Updated last year
- Probing the representations of Vision Transformers. ☆324 · Updated 2 years ago
- A library to inspect and extract intermediate layers of PyTorch models. ☆473 · Updated 2 years ago
- ☆205 · Updated 2 years ago
- ☆184 · Updated last year
- Named tensors with first-class dimensions for PyTorch ☆320 · Updated last year
- [NeurIPS 2022] Official PyTorch implementation of Optimizing Relevance Maps of Vision Transformers Improves Robustness. This code allows … ☆127 · Updated 2 years ago
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆377 · Updated last year
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆228 · Updated 3 months ago
- ☆131 · Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆203 · Updated last year
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆189 · Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆226 · Updated 7 months ago
- Just some miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new… ☆120 · Updated 9 months ago
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆407 · Updated 3 months ago
- Sequence modeling with Mega. ☆294 · Updated 2 years ago
- Understanding model mistakes with human annotations ☆106 · Updated 2 years ago
- Unofficial JAX implementations of deep learning research papers ☆156 · Updated 2 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆225 · Updated 3 years ago
- Library for 8-bit optimizers and quantization routines. ☆716 · Updated 2 years ago
- Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371. ☆179 · Updated 3 years ago
- ☆166 · Updated last year
- Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141) ☆456 · Updated 2 years ago