LDNOOBW / List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
☆3,106 Updated 10 months ago
Alternatives and similar repositories for List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words
Users who are interested in List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words are comparing it to the libraries listed below.
- A python tool for evaluating the quality of sentence embeddings. ☆2,108 Updated last year
- Public repo for HF blog posts ☆2,982 Updated this week
- Stand-alone language identification system ☆2,394 Updated 5 years ago
- Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) ☆1,212 Updated 8 months ago
- A natural language modeling framework based on PyTorch ☆6,325 Updated 2 years ago
- Tools to download and cleanup Common Crawl data ☆1,013 Updated 2 years ago
- Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017 ☆814 Updated 2 years ago
- Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP) ☆5,906 Updated 2 years ago
- Language-Agnostic SEntence Representations ☆3,644 Updated last year
- Open Location Code is a library to generate short codes, called "plus codes", that can be used as digital addresses where street addresse… ☆4,204 Updated this week
- 🤗 Evaluate: A library for easily evaluating machine learning models and datasets. ☆2,232 Updated 5 months ago
- Compact Language Detector 2 ☆864 Updated 4 years ago
- A curated list of speech and natural language processing resources ☆2,205 Updated 6 years ago
- A dataset of 200k English plaintext jokes. ☆619 Updated 2 years ago
- Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons ☆1,150 Updated 3 months ago
- Collection of papers and resources for data augmentation for NLP. ☆828 Updated 2 years ago
- Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ Pytorch Lightning and 🤗 Transfor… ☆1,063 Updated 2 months ago
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ☆1,000 Updated 10 months ago
- Accurately generate all possible forms of an English word, e.g. "election" --> "elect", "electoral", "electorate" etc. ☆632 Updated 3 years ago
- Evaluation code for various unsupervised automated metrics for Natural Language Generation. ☆1,385 Updated 9 months ago
- A linter for prose. ☆4,422 Updated this week
- BLEURT is a metric for Natural Language Generation based on transfer learning. ☆737 Updated last year
- Emoji for everyone. https://twemoji.twitter.com/ ☆17,093 Updated 10 months ago
- Locally run an Instruction-Tuned Chat-Style LLM ☆10,227 Updated 2 years ago
- Toolkit for creating, sharing and using natural language prompts. ☆2,882 Updated last year
- ☆835 Updated 2 years ago
- TextBox 2.0 is a text generation library with pre-trained language models ☆1,088 Updated last year
- Twitter Text Libraries. This code is used at Twitter to tokenize and parse text to meet the expectations for what can be used on the plat… ☆3,104 Updated last year
- Crawl BookCorpus ☆835 Updated last year
- ☆1,279 Updated 2 years ago