Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in One Unified Format
☆30 · Nov 25, 2021 · Updated 4 years ago
Alternatives and similar repositories for toxic-comment-collection
Users who are interested in toxic-comment-collection are comparing it to the repositories listed below
- [ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati… ☆10 · Sep 23, 2023 · Updated 2 years ago
- A repo to keep all resources about interpretability in NLP organised and up to date ☆12 · Nov 22, 2020 · Updated 5 years ago
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 · Nov 6, 2023 · Updated 2 years ago
- Code for FACTOID dataset paper in LREC 2022 ☆18 · Dec 19, 2022 · Updated 3 years ago
- Documenting large text datasets 🖼️ 📚 ☆14 · Dec 17, 2024 · Updated last year
- Hugging Face and Pyserini interoperability ☆19 · May 18, 2023 · Updated 2 years ago
- Can we use explanations to improve hate speech models? Our paper accepted at AAAI 2021 tries to explore that question. ☆233 · Jun 12, 2023 · Updated 2 years ago
- ☆15 · Apr 10, 2018 · Updated 7 years ago
- Code for "Goodtriever: Toxicity Mitigation with Retrieval-augmented Language Models" ☆25 · May 30, 2024 · Updated last year
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 · Aug 10, 2024 · Updated last year
- Detect toxic spans in toxic texts ☆71 · Jun 12, 2023 · Updated 2 years ago
- Explaining neural decisions contrastively to alternative decisions. ☆25 · Mar 18, 2021 · Updated 4 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆20 · Oct 23, 2023 · Updated 2 years ago
- DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning ☆23 · Aug 23, 2023 · Updated 2 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆89 · Aug 20, 2021 · Updated 4 years ago
- Data and code for the paper "The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems" ☆21 · Jul 18, 2023 · Updated 2 years ago
- TGLS: Unsupervised Text Generation by Learning from Search ☆25 · Jan 5, 2021 · Updated 5 years ago
- A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings) ☆27 · Sep 12, 2021 · Updated 4 years ago
- SemEval 2019 - Task 6 - Identifying and Categorizing Offensive Language in Social Media ☆26 · Feb 26, 2019 · Updated 7 years ago
- SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles ☆34 · Sep 30, 2022 · Updated 3 years ago
- Official repository for Characterization of tumor heterogeneity through segmentation-free representation learning on multiplexed imaging … ☆14 · Sep 28, 2025 · Updated 5 months ago
- This repository contains two independent news datasets used in the 2017 study: "This Just In: Fake News Packs a Lot in Title, Uses Simple… ☆30 · Apr 7, 2017 · Updated 8 years ago
- Official Code and Data repository of our ACL 2021 paper X-FACT: A New Benchmark Dataset for Multilingual Fact Checking. ☆27 · Oct 4, 2024 · Updated last year
- This repository contains a dataset for hate speech detection on social media platforms. ☆74 · Dec 9, 2022 · Updated 3 years ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning ☆33 · Jan 9, 2025 · Updated last year
- Vintage Typography with Web Fonts ☆14 · Dec 22, 2015 · Updated 10 years ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆10 · Dec 24, 2023 · Updated 2 years ago
- ☆106 · Oct 16, 2025 · Updated 4 months ago
- SPA: Efficient User-Preference Alignment against Uncertainty in Medical Image Segmentation (ICCV 2025) ☆14 · Sep 26, 2025 · Updated 5 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆39 · Jan 12, 2024 · Updated 2 years ago
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- RealTime Motion Capture Toolbox for Matlab ☆10 · Apr 11, 2016 · Updated 9 years ago
- The Spanish Fake News Corpus contains a collection of 971 news divided into 491 real news and 480 fake news. The corpus covers news from … ☆39 · Sep 21, 2021 · Updated 4 years ago
- ☆36 · Oct 1, 2020 · Updated 5 years ago
- A Multilingual Multi-Target Dataset for Stance Detection ☆41 · Jun 17, 2024 · Updated last year
- ☆35 · Dec 14, 2023 · Updated 2 years ago
- A review of class imbalanced problems using data augmentation and ensemble learning ☆10 · Mar 15, 2023 · Updated 2 years ago
- Fortifying Toxic Speech Detectors Against Veiled Toxicity ☆11 · Oct 21, 2020 · Updated 5 years ago
- Python script to assemble individual Tweets from a public Twitter stream (either Gnip activity-streams format or original Twitter API for… ☆12 · Aug 30, 2016 · Updated 9 years ago