wiedersehne / ParamixerLinks
Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention (CVPR 2022)
☆20Updated 2 years ago
Alternatives and similar repositories for Paramixer
Users that are interested in Paramixer are comparing it to the libraries listed below
Sorting:
- ☆41Updated 4 years ago
- Implementation of the Remixer Block from the Remixer paper, in Pytorch☆36Updated 3 years ago
- Official code for NeurIPS paper "Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach".☆16Updated 3 years ago
- ☆20Updated 2 years ago
- Simple notebooks to learn diffusion models on toy datasets☆17Updated 2 years ago
- Official PyTorch implementation of LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification☆46Updated 3 years ago
- Architecture embeddings independent from the parametrization of the search space☆15Updated 4 years ago
- Implementation of Kronecker Attention in Pytorch☆19Updated 4 years ago
- Cyclic Differentiable Architecture Search☆36Updated 3 years ago
- Official implementation of "UNAS: Differentiable Architecture Search Meets Reinforcement Learning", CVPR 2020 Oral☆61Updated last year
- Code for BlockSwap (ICLR 2020).☆33Updated 4 years ago
- [NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity"☆72Updated 2 years ago
- Code base for SRSGD.☆28Updated 5 years ago
- We investigated corruption robustness across different architectures including Convolutional Neural Networks, Vision Transformers, and th…☆16Updated 3 years ago
- ☆25Updated 3 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…☆51Updated 3 years ago
- Piecewise Linear Functions (PWL) implementation in PyTorch☆53Updated 3 years ago
- PyTorch and Torch implementation for our accepted CVPR 2020 paper (Oral): Controllable Orthogonalization in Training DNNs☆24Updated 4 years ago
- A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS).☆23Updated 3 years ago
- Code repository for the ICLR 2022 paper "FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes" https://openreview.ne…☆116Updated 2 years ago
- DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training (ICLR 2023)☆31Updated 2 years ago
- ImageNet-12k subset of ImageNet-21k (fall11)☆21Updated 2 years ago
- Successfully training approximations to full-rank matrices for efficiency in deep learning.☆17Updated 4 years ago
- A project to improve out-of-distribution detection (open set recognition) and uncertainty estimation by changing a few lines of code in y…☆45Updated 2 years ago
- [CVPR 2021] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator☆38Updated 3 years ago
- [NeurIPS 2021] “Stronger NAS with Weaker Predictors“, Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang W…☆27Updated 2 years ago
- [CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jon…☆69Updated 2 years ago
- Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection☆21Updated 4 years ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin…☆40Updated 2 years ago
- Paper and Code for "Curriculum Learning by Optimizing Learning Dynamics" (AISTATS 2021)☆19Updated 4 years ago