timoschick / dino
This repository contains the code for "Generating Datasets with Pretrained Language Models".
☆188Updated 3 years ago
Alternatives and similar repositories for dino
Users who are interested in dino are comparing it to the repositories listed below
- Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them"☆202Updated 3 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…☆136Updated last year
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale☆154Updated last year
- ☆182Updated last year
- ☆76Updated 3 years ago
- ☆97Updated 2 years ago
- Search Engines with Autoregressive Language models☆285Updated 2 years ago
- [EMNLP 2021] Improving and Simplifying Pattern Exploiting Training☆154Updated 2 years ago
- ☆97Updated 2 years ago
- A multilingual version of MS MARCO passage ranking dataset☆145Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆93Updated 2 years ago
- A BART version of an open-domain QA model in a closed-book setup☆119Updated 4 years ago
- MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance☆206Updated last year
- LM Pretraining with PyTorch/TPU☆134Updated 5 years ago
- An original implementation of EMNLP 2020, "AmbigQA: Answering Ambiguous Open-domain Questions"☆119Updated 3 years ago
- Hyperparameter Search for AllenNLP☆139Updated 2 months ago
- SacreROUGE is a library dedicated to the use and development of text generation evaluation metrics with an emphasis on summarization.☆143Updated 2 years ago
- Interpretable Evaluation for (Almost) All NLP Tasks☆195Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models.☆103Updated 2 years ago
- Viewer for the 🤗 datasets library.☆84Updated 3 years ago
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines☆136Updated last year
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch.☆102Updated 2 years ago
- ☆65Updated last year
- A repo to explore different NLP tasks which can be solved using T5☆172Updated 4 years ago
- Efficient Attention for Long Sequence Processing☆93Updated last year
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction☆119Updated 3 years ago
- Dense hybrid representations for text retrieval☆62Updated 2 years ago
- Pipeline for pulling and processing online language model pretraining data from the web☆177Updated last year
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization☆156Updated 2 years ago
- QED: A Framework and Dataset for Explanations in Question Answering☆116Updated 3 years ago