cgraywang / transformer-on-diet

Code repo for "Transformer on a Diet" paper

☆31

Alternatives and similar repositories for transformer-on-diet

Users who are interested in transformer-on-diet are comparing it to the libraries listed below.

stanis-morozov / prodige
Supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs.
☆47 Updated 5 years ago
TimDettmers / transformer-xl
☆64 Updated 5 years ago
ofirpress / PartialShuffle
☆14 Updated 6 years ago
cambridgeltl / parameter-factorization
Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer
☆39 Updated 4 years ago
ofirpress / sandwich_transformer
This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer …
☆55 Updated 4 years ago
zackchase / machine-learning-resources
A (possibly/eventually annotated?) collection of resources (books, demos, lectures, etc) that I personally like for various topics in mac…
☆32 Updated 6 years ago
nyu-mll / pretraining-learning-curves
The repository for the paper "When Do You Need Billions of Words of Pretraining Data?"
☆21 Updated 4 years ago
kyunghyuncho / backprop-kalman-filter
☆45 Updated 5 years ago
prajjwal1 / fluence
A deep learning library based on Pytorch focussed on low resource language research and robustness
☆70 Updated 3 years ago
nttcslab-nlp / doc_lm
☆12 Updated 6 years ago
shoarora / transformers-trainers
Tools for training pytorch language models
☆27 Updated 4 years ago
seraphlabs-ca / SentenceMIM-demo
This repo contains code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
☆28 Updated 3 years ago
ART-Group-it / GASP
GASP! Dataset - Generating Abstracts of Scientific Papers from Abstracts of Cited Papers
☆9 Updated 5 years ago
elanmart / psmm
☆49 Updated 7 years ago
srush / VirtualTeaching
DIY setup for virtual teaching on ubuntu
☆39 Updated 4 years ago
felixgwu / FastFusionNet
A PyTorch Implementation of FastFusionNet on SQuAD 1.1
☆39 Updated 6 years ago
viking-sudo-rm / stacknn-core
Pip-installable differentiable stacks in PyTorch!
☆65 Updated 4 years ago
omarsar / emotion_analysis_elastic_pytorch
Deep Emotion Analysis with Elastic and PyTorch
☆16 Updated 6 years ago
agadetsky / pytorch-definitions
[ACL 2018] Conditional Generators of Words Definitions
☆33 Updated 7 years ago
Smerity / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆15 Updated 5 years ago
zbloss / reformer_lm
a Pytorch implementation of the Reformer Network (https://openreview.net/pdf?id=rkgNKkHtvB)
☆53 Updated 2 years ago
IBM / HOTT
Code for NeurIPS 2019 paper "Hierarchical Optimal Transport for Document Representation"
☆54 Updated 5 years ago
lyeoni / pretraining-for-language-understanding
Pre-training of Language Models for Language Understanding
☆83 Updated 5 years ago
cybertronai / transformer-xl
Training Transformer-XL on 128 GPUs
☆140 Updated 5 years ago
MultiPath / Efficient-Neural-Machine-Translation
PhD thesis (updating) of Jiatao Gu from HKU
☆19 Updated 6 years ago
Kyubyong / lm_finetuning
Language Model Fine-tuning for Moby Dick
☆42 Updated 6 years ago
contentinnovation / NeurIPS-2018-papers
Machine-generated summaries and highlights of every accepted paper at the Thirty-second Conference on Neural Information Processing Syste…
☆71 Updated 6 years ago
dennybritz / papergraph-ui
Browse the CS/AI/ML research paper graph
☆51 Updated 2 years ago
neubig / howtocode-2017
An example of DyNet autobatching for the NIPS "how to code a paper" workshop
☆12 Updated 7 years ago
fdalvi / NeuroX-demo
☆66 Updated 2 years ago