machelreid / m2d2View external linksLinks
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54Nov 21, 2022Updated 3 years ago
Alternatives and similar repositories for m2d2
Users that are interested in m2d2 are comparing it to the libraries listed below
Sorting:
- ☆44Sep 16, 2020Updated 5 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 2 years ago
- [ACL 2021] Learning to Perturb Word Embeddings for Out-of-distribution QA☆16May 11, 2022Updated 3 years ago
- Convenient Text-to-Text Training for Transformers☆19Dec 10, 2021Updated 4 years ago
- Learning Semantic Parsers from Denotations with Latent Structured Alignments and Abstract Programs(EMNLP2019)☆19Dec 3, 2019Updated 6 years ago
- DEMix Layers for Modular Language Modeling☆54Aug 23, 2021Updated 4 years ago
- Official code for LEWIS, from: "LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer", ACL-IJCNLP 2021 Findings by Machel Rei…☆31Oct 24, 2022Updated 3 years ago
- DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings☆19Nov 24, 2021Updated 4 years ago
- Code and data to support the paper "PAQ 65 Million Probably-Asked Questions andWhat You Can Do With Them"☆209Aug 31, 2021Updated 4 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Aug 10, 2024Updated last year
- Query-focused summarization data☆43Feb 17, 2023Updated 2 years ago
- Data and code for the paper "The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems"☆21Jul 18, 2023Updated 2 years ago
- Exploring Few-Shot Adaptation of Language Models with Tables☆24Aug 22, 2022Updated 3 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆137Aug 2, 2023Updated 2 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022☆13Oct 20, 2022Updated 3 years ago
- Benchmark API for Multidomain Language Modeling☆25Aug 26, 2022Updated 3 years ago
- Repository for the code associated with the paper: Unsupervised Extractive Summarization using Mutual Information☆25Sep 11, 2021Updated 4 years ago
- Sequence modeling with Mega.☆303Jan 28, 2023Updated 3 years ago
- ☆14Dec 9, 2021Updated 4 years ago
- ☆11Oct 12, 2023Updated 2 years ago
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Mar 8, 2023Updated 2 years ago
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif…☆12Oct 21, 2022Updated 3 years ago
- A template primarily for PhD theses but also suitable for Bachelor's or Master's theses☆11Nov 10, 2021Updated 4 years ago
- The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper https//arxiv.org/abs/2212.01349)☆158Jan 6, 2023Updated 3 years ago
- Code of ACL 2022 paper Debiased Contrastive Learning of Unsupervised Sentence Representations☆32Mar 16, 2022Updated 3 years ago
- Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper.☆32Sep 26, 2023Updated 2 years ago
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- Poincaré Event Temporal Embeddings and Hyperbolic GRU for Event TempRel Extraction☆11Nov 8, 2021Updated 4 years ago
- ☆13Feb 7, 2023Updated 3 years ago
- ☆15Apr 12, 2023Updated 2 years ago
- MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale☆14Mar 22, 2021Updated 4 years ago
- Code for embedding and retrieval research.☆16Oct 24, 2023Updated 2 years ago
- Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP …☆17Oct 17, 2023Updated 2 years ago
- Data for EMNLP 2022 paper "arXivEdits: Understanding the Human Revision Process in Scientific Writing".☆14Sep 30, 2023Updated 2 years ago
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations.☆15Sep 3, 2024Updated last year
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Apr 26, 2023Updated 2 years ago
- ☆13Oct 2, 2023Updated 2 years ago
- ☆13Apr 24, 2022Updated 3 years ago
- ☆12Apr 18, 2019Updated 6 years ago