ofirpress / shortformer

Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.

☆147

Alternatives and similar repositories for shortformer

Users who are interested in shortformer are comparing it to the repositories listed below.

gsarti / lambda-bert
A 🤗-style implementation of BERT using lambda layers instead of self-attention
☆69 Updated 4 years ago
lucidrains / marge-pytorch
Implementation of Marge, Pre-training via Paraphrasing, in Pytorch
☆76 Updated 4 years ago
allenai / tpu_pretrain
LM Pretraining with PyTorch/TPU
☆135 Updated 5 years ago
uds-lsv / bert-stable-fine-tuning
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
☆136 Updated last year
lucidrains / charformer-pytorch
Implementation of the GBST block from the Charformer paper, in Pytorch
☆118 Updated 4 years ago
jwieting / paraphrastic-representations-at-scale
☆75 Updated 4 years ago
ofirpress / sandwich_transformer
This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer …
☆55 Updated 4 years ago
patil-suraj / onnx_transformers
Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.
☆127 Updated 4 years ago
microsoft / xtreme-distil-transformers
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale
☆155 Updated last year
wangcongcong123 / ttt
A package for fine-tuning Transformers with TPUs, written in Tensorflow2.0+
☆38 Updated 4 years ago
google-research-datasets / Disfl-QA
A Benchmark Dataset for Understanding Disfluencies in Question Answering
☆63 Updated 4 years ago
timoschick / bertram
This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations".
☆64 Updated 4 years ago
Georgetown-IR-Lab / ExtendedSumm
On Generating Extended Summaries of Long Documents
☆78 Updated 4 years ago
cambridgeltl / parameter-factorization
Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer
☆39 Updated 4 years ago
google-research-datasets / QED
QED: A Framework and Dataset for Explanations in Question Answering
☆117 Updated 4 years ago
huggingface / datasets-viewer
Viewer for the 🤗 datasets library.
☆84 Updated 4 years ago
martiansideofthemoon / hurdles-longform-qa
Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a…
☆46 Updated 3 years ago
gsarti / t5-flax-gcp
Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP
☆58 Updated 3 years ago
yandex-research / graph-glove
PyTorch code for the EMNLP 2020 paper "Embedding Words in Non-Vector Space with Unsupervised Graph Learning"
☆41 Updated 4 years ago
nng555 / ssmba
☆62 Updated 3 years ago
N-Almarwani / DCT_Sentence_Embedding
Efficient-Sentence-Embedding-using-Discrete-Cosine-Transform
☆17 Updated 5 years ago
UriSha / EmbeddinglessNMT
The implementation of "Neural Machine Translation without Embeddings", NAACL 2021
☆33 Updated 4 years ago
sobamchan / pytorch-lightning-transformers
Fine-tune transformers with pytorch-lightning
☆44 Updated 3 years ago
facebookresearch / UnsupervisedDecomposition
PyTorch original implementation of "Unsupervised Question Decomposition for Question Answering"
☆121 Updated last year
allenai / sledgehammer
☆47 Updated 5 years ago
TimDettmers / transformer-xl
☆64 Updated 5 years ago
CurationCorp / curation-corpus
Code for obtaining the Curation Corpus abstractive text summarisation dataset
☆128 Updated 4 years ago
FreddeFrallan / Contrastive-Tension
State of the art Semantic Sentence Embeddings
☆99 Updated 3 years ago
wietsedv / gpt2-recycle
As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021)
☆48 Updated 4 years ago
cimeister / typical-sampling
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
☆82 Updated 3 years ago