mlcommons / dynabench
☆21 Updated this week
Alternatives and similar repositories for dynabench:
Users who are interested in dynabench are comparing it to the libraries listed below.
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆31 Updated 10 months ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 3 years ago
- ☆77 Updated last year
- ☆23 Updated 3 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆24 Updated 4 months ago
- ☆34 Updated 2 weeks ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Repo to hold code and track issues for the collection of permissively licensed data ☆23 Updated 2 weeks ago
- This repository contains code and data for the EMNLP 2022 paper "CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about… ☆10 Updated 2 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023) ☆19 Updated last year
- A list of resources dedicated to compositionality ☆14 Updated 6 years ago
- ☆44 Updated 5 months ago
- A summarization dataset consisting of over 17k open access business journal articles. ☆10 Updated 4 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- Embedding Recycling for Language models ☆38 Updated last year
- arXiv plain text extraction ☆41 Updated 2 years ago
- This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper ☆14 Updated 3 years ago
- ☆19 Updated last year
- codebase release for EMNLP2023 paper publication ☆19 Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆33 Updated 11 months ago
- Hugging Face and Pyserini interoperability ☆20 Updated last year
- ☆46 Updated last year
- ☆90 Updated 2 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 Updated 3 years ago
- A highly sophisticated sequence-to-sequence model for code generation ☆40 Updated 3 years ago
- Mechanistic Interpretability for Transformer Models ☆50 Updated 2 years ago
- Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations" ☆31 Updated last year
- Code for "CyberWallE at SemEval-2020 Task 11: An Analysis of Feature Engineering for Ensemble Models for Propaganda Detection" (V. Blasch… ☆9 Updated 4 years ago
- ☆12 Updated 3 years ago
- The official code of EMNLP 2022, "SCROLLS: Standardized CompaRison Over Long Language Sequences". ☆69 Updated last year