samiraabnar / Reflect
Official Implementation of "Transferring Inductive Biases Through Knowledge Distillation"
☆15 Updated 5 years ago
Alternatives and similar repositories for Reflect
Users who are interested in Reflect are comparing it to the libraries listed below.
- ☆24 Updated 8 months ago
- Implementation of the GLOM model for text ☆11 Updated 4 years ago
- A supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs. ☆47 Updated 6 years ago
- ☆25 Updated last year
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆62 Updated last year
- Pretrained TorchVision models on CIFAR10 dataset (with weights) ☆24 Updated 5 years ago
- Code for gradient rollback, which explains predictions of neural matrix factorization models, as for example used for knowledge base comp… ☆21 Updated 4 years ago
- ☆24 Updated 5 years ago
- A collection of Models, Datasets, DataModules, Callbacks, Metrics, Losses and Loggers to better integrate pytorch-lightning with transfor… ☆47 Updated 2 years ago
- Code for "MIM: Mutual Information Machine" paper. ☆15 Updated 3 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆49 Updated 4 years ago
- [NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering". ☆50 Updated 2 years ago
- A pytorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification f… ☆46 Updated 5 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆65 Updated 2 years ago
- This repository provides the code for replicating the experiments in the paper "Building One-Shot Semi-supervised (BOSS) Learning up to F… ☆36 Updated 5 years ago
- Code repo for "Transformer on a Diet" paper ☆31 Updated 5 years ago
- MTAdam: Automatic Balancing of Multiple Training Loss Terms ☆36 Updated 5 years ago
- ☆62 Updated 3 years ago
- Code for NeurIPS 2019 paper "Hierarchical Optimal Transport for Document Representation" ☆54 Updated 5 years ago
- ☆34 Updated 7 years ago
- PhD thesis (updating) of Jiatao Gu from HKU ☆19 Updated 7 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845 ☆120 Updated 4 years ago
- A simple implementation of a deep linear Pytorch module ☆21 Updated 5 years ago
- Code publication to the paper "Normalized Attention Without Probability Cage" ☆17 Updated 4 years ago
- Jupyter notebook on Gumbel-max and Gumbel-softmax tricks ☆40 Updated 3 years ago
- ☆65 Updated 5 years ago
- Code for Unsupervised Discovery of Multimodal Links in Multi-Image/Multi-Sentence Documents ☆30 Updated 5 years ago
- Implementation of experiments in paper "Learning from Rules Generalizing Labeled Exemplars" to appear in ICLR2020 (https://openreview.net… ☆50 Updated 2 years ago
- Humans understand novel sentences by composing meanings and roles of core language components. In contrast, neural network models for nat… ☆27 Updated 5 years ago
- Code for our paper: "Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers". ☆20 Updated 4 years ago