edahanoam / Awesome-Summarization-Datasets
A continually updated collection of summarization datasets in 100+ languages, based on our paper "The State and Fate of Summarization Datasets: A Survey".
☆14 Updated last week
Alternatives and similar repositories for Awesome-Summarization-Datasets:
Users who are interested in Awesome-Summarization-Datasets are comparing it to the repositories listed below:
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ☆29 Updated 4 months ago
- ☆27 Updated 5 months ago
- Fairseq tutorial ☆17 Updated 2 years ago
- GRUEN for Evaluating Linguistic Quality of Generated Text (EMNLP 2020 Findings) ☆29 Updated last month
- The Stanford Word Substitution (Swords) Benchmark ☆32 Updated 3 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco… ☆34 Updated last year
- A curated list of research papers and resources on Cultural LLM. ☆42 Updated 7 months ago
- Library for experimenting with state-of-the-art evaluation metrics like UScore ☆12 Updated last year
- A library for minimum Bayes risk (MBR) decoding ☆37 Updated 3 weeks ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 3 years ago
- ☆23 Updated last year
- ☆24 Updated 11 months ago
- How to finetune mbart using fairseq ☆24 Updated 4 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆25 Updated 4 months ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 Updated last year
- ☆29 Updated 2 years ago
- Quality Controlled Paraphrase Generation (ACL 2022) ☆70 Updated 2 years ago
- ☆25 Updated 2 years ago
- Pretraining scripts for BART transformer model ☆11 Updated last year
- ☆13 Updated 2 years ago
- [TMLR'23] Contrastive Search Is What You Need For Neural Text Generation ☆119 Updated 2 years ago
- This repository contains an extension of fairseq for pixel / visual representations for machine translation. ☆34 Updated last year
- ☆46 Updated 3 years ago
- ☆82 Updated 2 years ago
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022) ☆78 Updated last year
- A library of translation-based text similarity measures ☆25 Updated last year
- PyTorch reimplementation of REALM and ORQA ☆22 Updated 3 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆75 Updated 3 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆58 Updated 4 years ago