gentaiscool / miners
MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings)
☆11 · Updated last month
Related projects
Alternatives and complementary repositories for miners
- PyTorch reimplementation of the paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization" ☆16 · Updated 3 years ago
- We construct and introduce DIALFACT, a testing benchmark dataset of crowd-annotated conversational claims, paired with pieces of evidence fr… ☆41 · Updated 2 years ago
- The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters". ☆17 · Updated 2 years ago
- The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval". ☆27 · Updated 3 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 · Updated 3 years ago
- PyTorch implementation of NAACL 2021 paper "Multi-view Subword Regularization" ☆24 · Updated 3 years ago
- Code for DS2 paper ☆20 · Updated 2 years ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022). ☆43 · Updated last year
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 · Updated 2 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 · Updated 3 years ago
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval" ☆41 · Updated 2 years ago
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 · Updated 3 years ago
- Pre-training BART in Flax on The Pile dataset ☆20 · Updated 3 years ago
- Official implementation of the ACL 2022 paper "Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization" ☆14 · Updated last year
- Mr. TyDi is a multilingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆72 · Updated 2 years ago
- The code implementation of the EMNLP 2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene… ☆25 · Updated last year
- KETOD: Knowledge-Enriched Task-Oriented Dialogue ☆31 · Updated last year
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆43 · Updated 3 months ago
- Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022 ☆14 · Updated 2 years ago
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆12 · Updated 11 months ago
- Simple Questions Generate Named Entity Recognition Datasets (EMNLP 2022) ☆76 · Updated last year
- Code for "Unsupervised Enrichment of Persona-grounded Dialog with Background Stories", ACL 2021 ☆10 · Updated 3 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi… ☆49 · Updated 3 years ago