sgraaf / Replicate-Toronto-BookCorpus

This repository contains code to replicate the Toronto BookCorpus dataset, which is no longer publicly available.

☆49

Alternatives and similar repositories for Replicate-Toronto-BookCorpus

Users who are interested in Replicate-Toronto-BookCorpus are comparing it to the libraries listed below.

allenai / tpu_pretrain
LM Pretraining with PyTorch/TPU
☆135 Updated 5 years ago
TimDettmers / transformer-xl
☆64 Updated 5 years ago
cambridgeltl / parameter-factorization
Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer
☆39 Updated 4 years ago
mandarjoshi90 / pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
☆62 Updated 2 years ago
ofirpress / sandwich_transformer
This repository contains the code for running the character-level Sandwich Transformers from our ACL 2020 paper on Improving Transformer …
☆55 Updated 4 years ago
allenai / allentune
Hyperparameter Search for AllenNLP
☆139 Updated 4 months ago
carolinlawrence / BiSon
Code for bidirectional sequence generation (BiSon) for generating from BERT pre-trained models.
☆51 Updated 5 years ago
jwieting / paraphrastic-representations-at-scale
☆75 Updated 4 years ago
allenai / sledgehammer
☆47 Updated 5 years ago
neulab / lrlm
Code for the paper "Latent Relation Language Models" at AAAI-20.
☆41 Updated 4 years ago
acl-org / acl-2020-virtual-conference
Repository for the ACL 2020 virtual conference website (work in progress)
☆39 Updated 3 years ago
yanaiela / num_fh
numeric fused-head identification and resolution
☆33 Updated 5 years ago
ofirpress / shortformer
Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.
☆147 Updated 4 years ago
tombosc / cpae
Code for EMNLP 2018 paper "Auto-Encoding Dictionary Definitions into Consistent Word Embeddings"
☆36 Updated 6 years ago
AkariAsai / extractive_rc_by_runtime_mt
Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation"
☆40 Updated 6 years ago
google-research-datasets / wiki-atomic-edits
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contai…
☆106 Updated 6 years ago
google-research-datasets / query-wellformedness
25,100 queries from the Paralex corpus (Fader et al., 2013) annotated with human ratings of whether they are well-formed natural languag…
☆84 Updated 6 years ago
neulab / langrank
A program to choose transfer languages for cross-lingual learning
☆72 Updated 2 years ago
lucidrains / marge-pytorch
Implementation of Marge, Pre-training via Paraphrasing, in Pytorch
☆76 Updated 4 years ago
TurkuNLP / wikibert
BERT models for many languages created from Wikipedia texts
☆33 Updated 5 years ago
timoschick / bertram
This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations".
☆64 Updated 4 years ago
ethanjperez / convince
Finding Generalizable Evidence by Learning to Convince Q&A Models
☆25 Updated 2 years ago
jekbradbury / revtok
Reversible tokenization in Python.
☆60 Updated 6 years ago
uds-lsv / bert-stable-fine-tuning
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
☆136 Updated last year
nyu-mll / CoLA-baselines
Baselines and corpus accompanying paper Neural Network Acceptability Judgments
☆56 Updated 5 years ago
google-research-datasets / Disfl-QA
A Benchmark Dataset for Understanding Disfluencies in Question Answering
☆63 Updated 4 years ago
jwieting / simple-and-effective-paraphrastic-similarity
Python code for training models in the ACL paper, "Simple and Effective Paraphrastic Similarity from Parallel Translations".
☆22 Updated 5 years ago
huggingface / bert-syntax
Assessing syntactic abilities of BERT
☆39 Updated 6 years ago
swabhs / scaffolding
Frame-Semantic and PropBank Semantic Role Labeling with Syntactic Scaffolding.
☆50 Updated 4 years ago
bigscience-workshop / evaluation
Code and Data for Evaluation WG
☆42 Updated 3 years ago