surge-ai / profanity
The world's largest profanity list.
☆202 Updated 7 months ago
Related projects
Alternatives and complementary repositories for profanity
- The world's largest social media toxicity dataset. ☆176 Updated 2 years ago
- Blazingly fast cleaning swear words (and their leetspeak) in strings ☆211 Updated 6 months ago
- Testing and training detection models for emoji-based hate speech. ☆23 Updated 2 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆159 Updated 10 months ago
- A multilingual lexicon of words to hurt. ☆80 Updated 2 weeks ago
- ☆25 Updated 10 months ago
- Cleans Reddit Text Data ☆81 Updated 4 years ago
- Topic Inference with Zeroshot models ☆61 Updated last year
- Conversational text Analysis using various NLP techniques ☆178 Updated last year
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆215 Updated last year
- DeEpLearning models for MultIlingual haTespeech (DELIMIT): Benchmarking multilingual models across 9 languages and 16 datasets. ☆107 Updated last year
- A corpus of comments tagged for multiple attributes of unhealthiness. ☆34 Updated 3 years ago
- Fixes contractions such as `you're` to `you are` ☆312 Updated 2 years ago
- ☆44 Updated 2 years ago
- Datasets for Hate Speech Detection ☆115 Updated last year
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ☆164 Updated 4 years ago
- Code and data for the paper, "Automatically Neutralizing Subjective Bias in Text" ☆196 Updated 3 months ago
- Hate Speech Detection Library for Python. ☆189 Updated last week
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban… ☆102 Updated 10 months ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆38 Updated 4 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆88 Updated last year
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆86 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- A Directory of Online Newspaper Sources for 70+ Languages ☆28 Updated 3 years ago
- Full list of bad words and top swear words banned by Google. ☆609 Updated 2 months ago
- Official repository of the Hate Speech Detection Tasks at Evalita ☆12 Updated 3 years ago
- Build a dialog dataset from online books in many languages ☆72 Updated 2 years ago
- Catalog of abusive language data (PLoS 2020) ☆304 Updated 5 months ago
- Semantic Orientation Calculator for Sentiment Analysis ☆52 Updated 2 years ago
- Measure the readability of a given text using surface characteristics ☆72 Updated last year