openai / moderation-api-release
☆142 Updated 3 years ago
Alternatives and similar repositories for moderation-api-release
Users who are interested in moderation-api-release are comparing it to the libraries listed below
- ☆220 Updated 4 years ago
- ☆241 Updated 2 years ago
- This repo contains the code for generating the ToxiGen dataset, published at ACL 2022. ☆331 Updated last year
- ☆115 Updated last year
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆91 Updated 10 months ago
- A Comprehensive Assessment of Trustworthiness in GPT Models ☆303 Updated last year
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆233 Updated last year
- Repository for research in the field of Responsible NLP at Meta. ☆202 Updated 4 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆105 Updated this week
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆107 Updated 2 years ago
- Inspecting and Editing Knowledge Representations in Language Models ☆116 Updated 2 years ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆116 Updated 7 months ago
- Repository for the Bias Benchmark for QA dataset.