tscheepers / Wikipedia-Summary-Dataset

This dataset contains all titles and summaries (or introductions) of English Wikipedia articles, extracted in September of 2017. It could be useful if one wants to use the smaller, more concise, and more definitional summaries in their research, or if one just wants to use a smaller but still diverse dataset for efficient training with resource …

☆56

Alternatives and similar repositories for Wikipedia-Summary-Dataset:

Users who are interested in Wikipedia-Summary-Dataset are comparing it to the repositories listed below.

hiroki13 / span-based-srl
☆46 Updated 5 years ago
TalSchuster / CrossLingualContextualEmb
Cross-Lingual Alignment of Contextual Word Embeddings
☆99 Updated 4 years ago
jzhou316 / Unsupervised-Sentence-Summarization
Unsupervised sentence summarization by contextual matching
☆47 Updated 3 years ago
google-research-datasets / noun-verb
This dataset contains naturally-occurring English sentences that feature non-trivial noun-verb ambiguity.
☆35 Updated 5 years ago
mcdm / CommitmentBank
Materials related to our Sinn und Bedeutung 23 paper
☆38 Updated 4 years ago
mandarjoshi90 / pair2vec
pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
☆61 Updated 2 years ago
cocoxu / SemEval-PIT2015
data and scripts for the shared task "Task 1: Paraphrase and Semantic Similarity in Twitter (PIT)" at SemEval 2015
☆44 Updated 4 years ago
ghaddarAbs / WiNER
☆34 Updated 3 years ago
jwieting / para-nmt-50m
Pre-trained models and code and data to train and use models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions o…
☆101 Updated last year
idiap / HAN_NMT
Document-Level Neural Machine Translation with Hierarchical Attention Networks
☆68 Updated 2 years ago
dojoteef / synst
Source code to reproduce the results in the ACL 2019 paper "Syntactically Supervised Transformers for Faster Neural Machine Translation"
☆81 Updated 2 years ago
MysteryVaibhav / robust_mtnt
Code for the paper "Improving Robustness of Machine Translation with Synthetic Noise"
☆21 Updated 5 years ago
raosudha89 / clarification_question_generation_pytorch
Code and data for the paper: Answer-based Adversarial Training for Generating Clarification Questions
☆43 Updated 4 years ago
cambridgeltl / xcopa
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
☆101 Updated 3 years ago
yzhangcs / crfpar
[ACL'20, IJCAI'20] Code for "Efficient Second-Order TreeCRF for Neural Dependency Parsing" and "Fast and Accurate Neural CRF Constituency…
☆77 Updated 4 years ago
allenanie / DisExtract
The library that uses dependency parsing to preprocess text to train DisSent model
☆33 Updated 4 years ago
uwnlp / open_type
☆70 Updated 2 years ago
YueDongCS / EditNTS
This repo contains the code for our paper "EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit E…
☆57 Updated 4 years ago
subramanyamdvss / UnsupNTS
Unsupervised Neural Text Simplification
☆32 Updated 3 years ago
dykang / PASTEL
Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Ann…
☆29 Updated 4 years ago
google-research-datasets / discofuse
☆32 Updated 3 years ago
conll / reference-coreference-scorers
This is the reference implementation of commonly used coreference metrics.
☆74 Updated 6 years ago
tag-and-generate / politeness-dataset
Dataset for the politeness transfer task
☆37 Updated 3 years ago
afshinrahimi / mmner
Massively Multilingual Transfer for NER
☆85 Updated 3 years ago
tagoyal / sow-reap-paraphrasing
Contains data/code for the paper "Neural Syntactic Preordering for Controlled Paraphrase Generation" (ACL 2020).
☆76 Updated 5 months ago
neulab / cmu-multinlp
Generalizing Natural Language Analysis through Span-relation Representations
☆91 Updated 2 years ago
KaijuML / data-to-text-hierarchical
Code for A Hierarchical Model for Data-to-Text Generation (Rebuffel, Soulier, Scoutheeten, Gallinari; ECIR 2020)
☆83 Updated last year
PKU-TANGENT / NeuralEDUSeg
A toolkit for discourse segmentation (EDU segmentation).
☆102 Updated last year
jiacheng-xu / neu-compression-sum
Joint Extraction & Compression text Summarization
☆41 Updated 5 years ago
WebNLG / GenerationEval
WebNLG+ Challenge 2020: Scripts to evaluate the RDF-to-text task with automatic metrics (BLEU, METEOR, chrF++, TER and BERT-Score)
☆16 Updated 4 months ago