mainlp/awesome-human-label-variation

A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, accompanying "The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation" (EMNLP 2022).

☆102

Alternatives and similar repositories for awesome-human-label-variation

Users who are interested in awesome-human-label-variation are comparing it to the repositories listed below.

cardiffnlp / dialz
The official repo for the Dialz Python library - a toolkit for steering vector research.
☆27 · Mar 26, 2026 · Updated 3 months ago
ahmetustun / hyperx
☆21 · Dec 5, 2022 · Updated 3 years ago
mainlp / germanic-lrl-corpora
Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource…
☆28 · Feb 16, 2026 · Updated 5 months ago
google-research-datasets / dices-dataset
This repository contains two datasets with multi-turn adversarial conversations generated by human agents interacting with a dialog model…
☆35 · Jul 16, 2024 · Updated 2 years ago
kowndinya-renduchintala / POSIX
POSIX: A Prompt Sensitivity Index for Language Models
☆13 · Nov 13, 2024 · Updated last year
naverlabseurope / ALPS2024-MT-LAB
CD20200004 from 01/01/2021 to 31/12/2023 - LIG UGA - Python Notebook and Models for the MT Lab @ ALPS 2022
☆13 · Apr 1, 2024 · Updated 2 years ago
huggingface / that_is_good_data
☆65 · Aug 7, 2023 · Updated 2 years ago
kite99520 / DialSummEval
Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"
☆14 · Jul 22, 2025 · Updated last year
simonasnow / MultilingualPerspectivistNLU
☆10 · May 30, 2024 · Updated 2 years ago
INK-USC / CrossFit
Code for paper "CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP" (https://arxiv.org/abs/2104.08835)
☆113 · Apr 28, 2022 · Updated 4 years ago
adapter-hub / hgiyt
Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"
☆28 · Oct 3, 2021 · Updated 4 years ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 · Aug 25, 2023 · Updated 2 years ago
UKPLab / nessie
Automatically detect errors in annotated corpora.
☆48 · Sep 8, 2023 · Updated 2 years ago
G-Research / dgraph-dbpedia
Pre-processing DBpedia datasets to load into Dgraph
☆13 · Mar 6, 2022 · Updated 4 years ago
SLAB-NLP / BUG
A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021
☆14 · Apr 3, 2022 · Updated 4 years ago
EyeBench / eyebench
EyeBench: Predictive Modeling from Eye Movements in Reading
☆17 · Apr 6, 2026 · Updated 3 months ago
conversationai / unhealthy-conversations
A corpus of comments tagged for multiple attributes of unhealthiness.
☆37 · Mar 25, 2021 · Updated 5 years ago
hectormartinez / ud_unsup_parser
☆22 · Jun 22, 2022 · Updated 4 years ago
NPoe / neural-nlp-explanation-experiment
☆14 · Jun 8, 2018 · Updated 8 years ago
dmg-illc / JUDGE-BENCH
☆40 · Jul 24, 2025 · Updated 11 months ago
xu1998hz / SEScore
This repo contains all the codes for SEScore implementation
☆15 · Mar 3, 2025 · Updated last year
machamp-nlp / machamp
Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/
☆91 · Jun 3, 2026 · Updated last month
fulifeng / Counterfactual_Reasoning_Model
The source code of "Empowering Language Understanding with Counterfactual Reasoning" (ACL'21)
☆11 · Sep 3, 2021 · Updated 4 years ago
cleanlab / multiannotator-benchmarks
Benchmarking algorithms for assessing quality of data labeled by multiple annotators
☆34 · Dec 3, 2025 · Updated 7 months ago
umanlp / RedditBias
Code & Data for the paper "RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models"
☆32 · May 31, 2021 · Updated 5 years ago
cambridgeltl / ACL2022_tutorial_multilingual_dialogue
Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022
☆14 · May 21, 2022 · Updated 4 years ago
mmmaurer / elfen
A Python package to efficiently extract linguistic features for text/NLP datasets
☆39 · Jun 8, 2026 · Updated last month
Perez-AlmendrosC / dontpatronizeme
☆34 · Feb 25, 2026 · Updated 4 months ago
js-d / sim_metric
☆36 · Oct 3, 2023 · Updated 2 years ago
Mckysse / GAIN
Winner system (USTC-NELSLIP) of SemEval 2022 MultiCoNER shared task on 3 tracks (Chinese, Bangla, Code-Mixed).
☆14 · Nov 15, 2022 · Updated 3 years ago
sixhobbits / yelp-dataset-2017
Submission to the Yelp Dataset Challenge 2017
☆15 · Jun 30, 2017 · Updated 9 years ago
wuningxi / Talks
Slides from previous talks.
☆29 · Nov 23, 2023 · Updated 2 years ago
dadelani / menyo-20k_MT
☆11 · Jul 12, 2021 · Updated 5 years ago
sebastianGehrmann / CausalMediationAnalysis
Code for the paper "Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias"
☆81 · Aug 25, 2021 · Updated 4 years ago
writing-assistant / writing-assistant.github.io
☆18 · Sep 3, 2024 · Updated last year
adapter-hub / efficient-task-transfer
Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021
☆37 · Dec 21, 2021 · Updated 4 years ago
fullflu / pydtr
Python library of Dynamic Treatment Regimes
☆10 · Oct 26, 2020 · Updated 5 years ago
Aarhus-Psychiatry-Research / timeseriesflattener
Converting irregularly spaced time series, such as electronic health records, into dataframes for tabular classification.
☆20 · Jun 17, 2025 · Updated last year
cfwelch / longitudinal_dialog
Code for publications related to longitudinal dialog research.
☆11 · May 23, 2019 · Updated 7 years ago