julian-risch / toxic-comment-collection
Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in One Unified Format
☆27 Updated 2 years ago
Related projects
Alternatives and complementary repositories for toxic-comment-collection
- ☆38 Updated last year
- Contrastive Fact Verification ☆70 Updated 2 years ago
- ☆73 Updated 3 years ago
- A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021 ☆12 Updated 2 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆20 Updated last year
- ☆50 Updated 2 years ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆86 Updated last year
- ☆21 Updated 6 months ago
- Pretraining scripts for BART transformer model ☆11 Updated last year
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 Updated 2 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆86 Updated 3 years ago
- ☆27 Updated last year
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆46 Updated 2 years ago
- Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!" ☆18 Updated last year
- Dataset + classifier tools to study social perception biases in natural language generation ☆67 Updated last year
- Code for CAET5 ☆23 Updated last year
- Data and code for our paper "Exploring and Predicting Transferability across NLP Tasks", to appear at EMNLP 2020. ☆48 Updated 3 years ago
- FRANK: Factuality Evaluation Benchmark ☆52 Updated last year
- Statistics on multilingual datasets ☆17 Updated 2 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 Updated 3 years ago
- Code and test data for "On Measuring Bias in Sentence Encoders", to appear at NAACL 2019. ☆54 Updated 3 years ago
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 Updated 3 years ago
- ☆67 Updated 3 years ago
- ☆31 Updated last year
- code for our EACL 2021 paper: "Challenges in Automated Debiasing for Toxic Language Detection" by Xuhui Zhou, Maarten Sap, Swabha Swayamd… ☆19 Updated 3 years ago
- ☆38 Updated last year
- Codebase, data and models for the Keep it Simple paper at ACL2021 ☆36 Updated last year
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆84 Updated 2 years ago
- Code for Massive-scale Decoding for Text Generation using Lattices ☆42 Updated 2 years ago
- ☆16 Updated 2 years ago