ricsinaruto / gutenberg-dialog
Build a dialog dataset from online books in many languages
★72 · Updated 2 years ago
Alternatives and similar repositories for gutenberg-dialog:
Users who are interested in gutenberg-dialog are comparing it to the libraries listed below.
- Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation" ★40 · Updated 6 years ago
- 🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings ★58 · Updated 2 years ago
- A program to choose transfer languages for cross-lingual learning ★72 · Updated last year
- Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Ann… ★29 · Updated 5 years ago
- Codebase for probing and visualizing multilingual models. ★47 · Updated 4 years ago
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ★62 · Updated 3 years ago
- ★56 · Updated 3 years ago
- Dual Encoders for State-of-the-art Natural Language Processing. ★61 · Updated 2 years ago
- The Benchmark of Linguistic Minimal Pairs ★149 · Updated 2 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ★38 · Updated 3 years ago
- ★92 · Updated last year
- Code to reproduce the experiments from the paper. ★100 · Updated last year
- Lexical Simplification with Pretrained Encoders ★70 · Updated 4 years ago
- One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits. ★123 · Updated 5 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ★55 · Updated 2 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ★57 · Updated last year
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ★37 · Updated 2 years ago
- A framework for building semantic parsers (including neural module networks) with AllenNLP, built by the authors of AllenNLP ★107 · Updated 2 years ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization. ★141 · Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ★48 · Updated 3 years ago
- ★29 · Updated last year
- ★102 · Updated 3 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ★26 · Updated 4 years ago
- ★32 · Updated 3 years ago
- Assessing syntactic abilities of BERT ★39 · Updated 5 years ago
- Implementation of Marge, Pre-training via Paraphrasing, in Pytorch ★75 · Updated 4 years ago
- A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contai… ★106 · Updated 5 years ago
- Contains data/code for the paper "Neural Syntactic Preordering for Controlled Paraphrase Generation" (ACL 2020). ★76 · Updated 7 months ago
- Resources for the "CTRLsum: Towards Generic Controllable Text Summarization" paper ★146 · Updated last year
- ★75 · Updated 3 years ago