surge-ai / toxicity
The world's largest social media toxicity dataset.
☆178Updated 2 years ago
Alternatives and similar repositories for toxicity:
Users who are interested in toxicity are comparing it to the libraries listed below.
- The AI Knowledge Editor☆182Updated 2 years ago
- A Python library for calculating a large variety of metrics from text☆324Updated last month
- Neural Search☆326Updated 7 months ago
- Conversational text Analysis using various NLP techniques☆179Updated last year
- Find and fix bugs in natural language machine learning models using adaptive testing.☆181Updated 8 months ago
- The world's largest profanity list.☆213Updated 9 months ago
- A library to synthesize text datasets using Large Language Models (LLM)☆151Updated 2 years ago
- A multilingual lexicon of words to hurt.☆82Updated 2 months ago
- Labelling platform for text using weak supervision.☆260Updated 2 years ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated last year
- A Python package for benchmarking interpretability techniques on Transformers.☆213Updated 4 months ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine☆242Updated last year
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy☆301Updated last year
- NUBIA (NeUral Based Interchangeability Assessor) is a new SoTA evaluation metric for text generation☆53Updated last year
- Weakly Supervised End-to-End Learning (NeurIPS 2021)☆157Updated last year
- Open source library for few shot NLP☆77Updated last year
- Code repository for the NAACL 2022 paper "ExSum: From Local Explanations to Model Understanding"☆64Updated 2 years ago
- The Python library with command line tools to interact with Dynabench (https://dynabench.org/), such as uploading models.☆55Updated 2 years ago
- Question-answers, collected from Google☆125Updated 3 years ago
- negate_sentence(A Python module that doesn't negate sentences.)☆28Updated 3 months ago
- Creating class-based TF-IDF matrices☆82Updated 2 years ago
- Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts, powered by AREkit☆37Updated last week
- This repository contains a dataset for hate speech detection on social media platforms.☆70Updated 2 years ago
- Aligning AI With Shared Human Values (ICLR 2021)☆267Updated last year
- XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale☆154Updated last year
- Code and data from the paper BERT Got a Date: Introducing Transformers to Temporal Tagging☆65Updated 2 years ago
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆111Updated 2 months ago
- An on-going dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000…☆57Updated 2 years ago
- A word2vec negative sampling implementation with correct CBOW update.☆261Updated 3 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…☆244Updated last year