☆155 · Aug 9, 2022 · Updated 3 years ago
Alternatives and similar repositories for moderation-api-release
Users who are interested in moderation-api-release are comparing it to the repositories listed below
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆128 · Feb 24, 2025 · Updated last year
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif… ☆12 · Oct 21, 2022 · Updated 3 years ago
- ☆31 · Feb 12, 2026 · Updated 2 weeks ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆120 · Feb 18, 2026 · Updated last week
- Websockify is a WebSocket to TCP proxy/bridge. This allows a browser to connect to any application/server/service. Implementations in Py… ☆28 · Nov 7, 2016 · Updated 9 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example ☆22 · Nov 28, 2021 · Updated 4 years ago
- BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs). ☆175 · Oct 27, 2023 · Updated 2 years ago
- Code for the paper "Batch size invariance for policy optimization" ☆56 · Apr 2, 2023 · Updated 2 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022 ☆13 · Oct 20, 2022 · Updated 3 years ago
- ☆10 · Oct 31, 2022 · Updated 3 years ago
- ☆27 · Nov 20, 2023 · Updated 2 years ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,816 · Jun 17, 2025 · Updated 8 months ago
- Service for quickly aliasing and redirecting to long URLs ☆24 · Apr 26, 2023 · Updated 2 years ago