unitaryai / detoxify
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at contact@unitary.ai.
★927 Updated last month
Related projects:
- TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs… ★2,899 Updated last month
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations ★770 Updated 4 months ago
- Catalog of abusive language data (PLoS 2020) ★299 Updated 3 months ago
- Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code. ★1,266 Updated last year
- Repository for TweetEval ★354 Updated 2 years ago
- BERTweet: A pre-trained language model for English Tweets (EMNLP-2020) ★574 Updated last month
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ★1,552 Updated last month
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList ★1,998 Updated 8 months ago
- This repo contains the code for generating the ToxiGen dataset, published at ACL 2022. ★270 Updated 3 months ago
- BLEURT is a metric for Natural Language Generation based on transfer learning. ★685 Updated last year
- Efficient few-shot learning with Sentence Transformers ★2,143 Updated this week
- OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track) ★714 Updated last month
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… ★1,192 Updated 8 months ago
- This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference" ★1,619 Updated last year
- ★1,100 Updated last month
- Can we use explanations to improve hate speech models? Our paper accepted at AAAI 2021 tries to explore that question. ★186 Updated last year
- Multimodal model for text and tabular data with HuggingFace transformers as building block for text data ★578 Updated last week
- skweak: A software toolkit for weak supervision applied to NLP tasks ★918 Updated 2 weeks ago
- TweetNLP for all the NLP enthusiasts working on Twitter! The Python library tweetnlp provides a collection of useful tools to analyze/und… ★304 Updated last month
- The implementation of DeBERTa ★1,966 Updated 11 months ago
- BookNLP, a natural language processing pipeline for books ★782 Updated last month
- A library that incorporates state-of-the-art explainers for text-based machine learning models and visualizes the result with a built-in … ★413 Updated 7 months ago
- Robustness Gym is an evaluation toolkit for machine learning. ★439 Updated 2 years ago
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ★1,965 Updated last month
- A Neural Language Style Transfer framework to transfer natural language text smoothly between fine-grained language styles like formal/ca… ★475 Updated 9 months ago
- Tools to download and cleanup Common Crawl data ★961 Updated last year
- Minimal keyword extraction with BERT ★3,441 Updated 2 months ago
- TextAugment: Text Augmentation Library ★394 Updated 7 months ago
- SGPT: GPT Sentence Embeddings for Semantic Search ★838 Updated 7 months ago
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x. ★561 Updated last year