wiedersehne / Paramixer
Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention (CVPR 2022)
☆20 Updated 3 years ago
Alternatives and similar repositories for Paramixer
Users that are interested in Paramixer are comparing it to the libraries listed below
- Architecture embeddings independent from the parametrization of the search space ☆15 Updated 4 years ago
- Piecewise Linear Functions (PWL) implementation in PyTorch ☆57 Updated 3 years ago
- [NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity" ☆74 Updated 3 years ago
- ☆41 Updated 4 years ago
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 Updated 2 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch ☆36 Updated 4 years ago
- A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS). ☆23 Updated 4 years ago
- Official code for NeurIPS paper "Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach". ☆16 Updated 3 years ago
- Automatic learning-rate scheduler ☆46 Updated 4 years ago
- This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers". ☆121 Updated 4 years ago
- PyTorch implementation of IRMAE https://arxiv.org/abs/2010.00679 ☆48 Updated 3 years ago
- Official Implementation of Convolutional Normalization: Improving Robustness and Training for Deep Neural Networks ☆30 Updated 3 years ago
- This repo is for our paper: Normalization Techniques in Training DNNs: Methodology, Analysis and Application ☆85 Updated 4 years ago
- Implementation of Kronecker Attention in Pytorch ☆19 Updated 5 years ago
- ☆25 Updated 4 years ago
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆208 Updated last year
- [CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jon… ☆68 Updated 3 years ago
- ☆20 Updated 2 years ago
- Paper and Code for "Curriculum Learning by Optimizing Learning Dynamics" (AISTATS 2021) ☆19 Updated 4 years ago
- PyTorch and Torch implementation for our accepted CVPR 2020 paper (Oral): Controllable Orthogonalization in Training DNNs ☆25 Updated 4 years ago
- Official implementation of "UNAS: Differentiable Architecture Search Meets Reinforcement Learning", CVPR 2020 Oral ☆63 Updated 2 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 Updated 3 years ago
- DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training (ICLR 2023) ☆32 Updated 2 years ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆41 Updated 3 months ago
- (ECCV 2022) BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks ☆51 Updated 3 years ago