allenai / c4-documentation
☆27 Updated 4 years ago
Alternatives and similar repositories for c4-documentation
Users that are interested in c4-documentation are comparing it to the libraries listed below
- ☆11 Updated 3 years ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆82 Updated 3 years ago
- ☆14 Updated 9 months ago
- ☆29 Updated 3 years ago
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu). ☆20 Updated 3 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 Updated 11 months ago
- Universal, customizable and deployable fine-grained evaluation for text generation. ☆23 Updated last year
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 Updated 2 years ago
- Embedding Recycling for Language models ☆39 Updated 2 years ago
- CCQA A New Web-Scale Question Answering Dataset for Model Pre-Training ☆32 Updated 2 years ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a… ☆46 Updated 2 years ago
- ☆35 Updated last year
- Exploring Few-Shot Adaptation of Language Models with Tables ☆24 Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections ☆50 Updated 3 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 Updated 2 years ago
- Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx… ☆136 Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 3 years ago
- Apps built using Inspired Cognition's Critique. ☆58 Updated 2 years ago
- KETOD Knowledge-Enriched Task-Oriented Dialogue ☆32 Updated 2 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆39 Updated last year
- This is the official PyTorch repo for "UNIREX: A Unified Learning Framework for Language Model Rationale Extraction" (ICML 2022). ☆25 Updated 2 years ago
- ☆25 Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆21 Updated 2 weeks ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- Efficient Memory-Augmented Transformers ☆34 Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 Updated 2 years ago
- ☆53 Updated 3 years ago