unitaryai / detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unitary.ai.
★1,071 · Updated 2 months ago
Alternatives and similar repositories for detoxify
Users who are interested in detoxify are comparing it to the libraries listed below.
- Repository for TweetEval ★377 · Updated 2 years ago
- Catalog of abusive language data (PLoS 2020) ★314 · Updated last year
- Can we use explanations to improve hate speech models? Our paper accepted at AAAI 2021 tries to explore that question. ★208 · Updated 2 years ago
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations ★785 · Updated last year
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ★593 · Updated 11 months ago
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ★1,233 · Updated 4 months ago
- This repo contains the code for generating the ToxiGen dataset, published at ACL 2022. ★321 · Updated last year
- Hate speech dataset from Stormfront forum manually labelled at sentence level. ★173 · Updated 5 years ago
- Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017 ★816 · Updated 2 years ago
- A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/ca… ★485 · Updated last year
- OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ★771 · Updated 11 months ago
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ★2,040 · Updated 10 months ago
- Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper ★394 · Updated last year
- BLEURT is a metric for Natural Language Generation based on transfer learning. ★739 · Updated last year
- Efficient few-shot learning with Sentence Transformers ★2,512 · Updated 2 months ago
- 🧹 Python package for text cleaning ★978 · Updated 2 years ago
- Compute Sentence Embeddings Fast! ★623 · Updated 2 years ago
- BERT score for text generation ★1,759 · Updated 11 months ago
- BookNLP, a natural language processing pipeline for books ★850 · Updated 11 months ago
- Collection of papers and resources for data augmentation for NLP. ★827 · Updated 2 years ago
- Top2Vec learns jointly embedded topic, document and word vectors. ★3,060 · Updated 7 months ago
- Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code. ★1,355 · Updated last year
- The world's largest social media toxicity dataset. ★181 · Updated 3 years ago
- TextAugment: Text Augmentation Library ★422 · Updated last year
- Tools to download and cleanup Common Crawl data ★1,016 · Updated 2 years ago
- A Python framework for performing information retrieval experiments, building on http://terrier.org/ ★462 · Updated 2 weeks ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList ★2,036 · Updated last year
- Open-Source Information Retrieval Courses @ TU Wien ★614 · Updated 2 years ago
- Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging F… ★573 · Updated last year
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ★350 · Updated 2 months ago