LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

LDNOOBW / List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

☆3,408

Alternatives and similar repositories for List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

Users interested in List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words compare it to the repositories listed below.

coffee-and-fun / google-profanity-words
Full list of bad words and top swear words banned by Google.
☆705 · Jun 10, 2026 · Updated last month
zautumnz / profane-words
A very long list of English profanity.
☆316 · Mar 8, 2026 · Updated 4 months ago
LDNOOBW / naughty-words-js
An npm/bower package to use the List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
☆75 · Jul 13, 2020 · Updated 6 years ago
reimertz / curse-words
☆32 · Aug 10, 2014 · Updated 11 years ago
google-research / text-to-text-transfer-transformer
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"
☆6,541 · Jul 8, 2026 · Updated 3 weeks ago
minimaxir / big-list-of-naughty-strings
The Big List of Naughty Strings is a list of strings which have a high probability of causing issues when used as user-input data.
☆47,701 · Apr 18, 2024 · Updated 2 years ago
google / sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
☆11,996 · Updated this week
facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,248 · Sep 30, 2025 · Updated 10 months ago
facebookresearch / fastText
Library for fast text representation and classification.
☆26,549 · Mar 22, 2024 · Updated 2 years ago
facebookresearch / cc_net
Tools to download and cleanup Common Crawl data
☆1,047 · Apr 25, 2023 · Updated 3 years ago
sebastianruder / NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the mo…
☆22,955 · Jul 28, 2024 · Updated 2 years ago
huggingface / transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model…
☆163,169 · Updated this week
first20hours / google-10000-english
This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of th…
☆4,437 · May 17, 2023 · Updated 3 years ago
chucknorris-io / swear-words
💩 Profanity means swear words. The adjective is 'profane'. Profanities can also be called curse ("cuss") words, dirty words, bad words, …
☆124 · Aug 27, 2022 · Updated 3 years ago
huggingface / sentence-transformers
State-of-the-Art Embeddings, Retrieval, and Reranking
☆18,953 · Updated this week
words / cuss
🤬 Map of profane words to a rating of sureness
☆267 · Apr 29, 2023 · Updated 3 years ago
allenai / allennlp
An open-source NLP research library, built on PyTorch.
☆11,890 · Nov 22, 2022 · Updated 3 years ago
makcedward / nlpaug
Data augmentation for NLP
☆4,664 · Updated this week
WikiExtractor / wikiextractor
A tool for extracting plain text from Wikipedia dumps
☆3,997 · Updated this week
google-research / bert
TensorFlow code and pre-trained models for BERT
☆40,058 · Jul 23, 2024 · Updated 2 years ago
facebookresearch / faiss
A library for efficient similarity search and clustering of dense vectors.
☆40,621 · Updated this week
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,175 · Jan 23, 2026 · Updated 6 months ago
NVIDIA / Megatron-LM
Ongoing research training transformer models at scale
☆17,265 · Updated this week
Mimino666 / langdetect
Port of Google's language-detection library to Python.
☆1,898 · Mar 3, 2025 · Updated last year
huawei-noah / Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
☆3,163 · Jan 22, 2024 · Updated 2 years ago
jina-ai / clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
☆12,834 · Jan 23, 2024 · Updated 2 years ago
flairNLP / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
☆14,382 · Oct 27, 2025 · Updated 9 months ago
zihangdai / xlnet
XLNet: Generalized Autoregressive Pretraining for Language Understanding
☆6,185 · May 28, 2023 · Updated 3 years ago
google / BIG-bench
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models
☆3,249 · Jul 19, 2024 · Updated 2 years ago
princeton-nlp / SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
☆3,655 · Oct 16, 2024 · Updated last year
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,923 · Feb 14, 2023 · Updated 3 years ago
bigscience-workshop / promptsource
Toolkit for creating, sharing and using natural language prompts.
☆3,027 · Oct 23, 2023 · Updated 2 years ago
tatsu-lab / stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
☆30,244 · Jul 17, 2024 · Updated 2 years ago
google-research / deduplicate-text-datasets
☆1,269 · Jul 30, 2024 · Updated 2 years ago
spencermountain / compromise
modest natural-language processing
☆12,149 · Jul 20, 2026 · Updated last week
facebookresearch / metaseq
Repo for external large-scale work
☆6,550 · Apr 27, 2024 · Updated 2 years ago
togethercomputer / RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
☆4,975 · Jun 3, 2026 · Updated last month
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,835 · Updated this week
google-research / multilingual-t5
☆1,294 · Dec 15, 2022 · Updated 3 years ago