surge-ai / toxicity
The world's largest social media toxicity dataset.
☆176 Updated 2 years ago
Related projects
Alternatives and complementary repositories for toxicity
- The world's largest profanity list. ☆201 Updated 7 months ago
- This repository contains a dataset for hate speech detection on social media platforms. ☆66 Updated last year
- Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts, powered by AREkit ☆37 Updated 3 months ago
- Conversational text Analysis using various NLP techniques ☆178 Updated last year
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… ☆48 Updated 3 years ago
- negate_sentence (A Python module that doesn't negate sentences.) ☆27 Updated 3 weeks ago
- The AI Knowledge Editor ☆182 Updated 2 years ago
- Find and fix bugs in natural language machine learning models using adaptive testing. ☆182 Updated 6 months ago
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs. ☆21 Updated last year
- NUBIA (NeUral Based Interchangeability Assessor) is a new SoTA evaluation metric for text generation ☆52 Updated last year
- Neural Search ☆325 Updated 5 months ago
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆89 Updated last week
- 💫 SpaCy wrapper for ConceptNet 💫 ☆88 Updated last year
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆107 Updated last year
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). ☆69 Updated 2 months ago
- ☆13 Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 5 months ago
- A set of tools for leveraging pre-trained embeddings, active learning and model explainability for efficient document classification ☆29 Updated 2 years ago
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale ☆153 Updated 10 months ago
- Documentation effort for the BookCorpus dataset ☆31 Updated 3 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆46 Updated 3 years ago
- A Python library with command-line tools for interacting with Dynabench (https://dynabench.org/), such as uploading models. ☆55 Updated 2 years ago
- diagNNose is a Python library that facilitates a broad set of tools for analysing hidden activations of neural models. ☆81 Updated last year
- ☆22 Updated 2 years ago
- ACL 2022 ☆124 Updated 11 months ago
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset" ☆53 Updated 2 years ago
- An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo ☆268 Updated last year
- This AI Does Not Exist: generate realistic descriptions of made-up machine learning models. ☆146 Updated 2 years ago
- A python package for benchmarking interpretability techniques on Transformers. ☆211 Updated last month
- Labelling platform for text using weak supervision. ☆260 Updated 2 years ago