hendrycks / ethics
Aligning AI With Shared Human Values (ICLR 2021)
☆297, updated 2 years ago
Alternatives and similar repositories for ethics
Users who are interested in ethics compare it to the repositories listed below.
- Repository for research in the field of Responsible NLP at Meta. ☆202, updated 4 months ago
- ☆114, updated last year
- Dataset associated with the paper "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation". ☆80, updated 4 years ago
- Repository for the Bias Benchmark for QA dataset. ☆128, updated last year
- Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`. ☆297, updated last year
- StereoSet: Measuring stereotypical bias in pretrained language models. ☆190, updated 2 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆88, updated 4 years ago
- PAIR.withgoogle.com and friends' work on interpretability methods. ☆202, updated last week
- ☆217, updated 4 years ago
- Few-shot Learning of GPT-3. ☆355, updated 2 years ago
- The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models. ☆179, updated 3 years ago
- Inspecting and Editing Knowledge Representations in Language Models. ☆116, updated 2 years ago
- The Prism Alignment Project. ☆79, updated last year
- Datasets from the paper "Towards Understanding Sycophancy in Language Models". ☆92, updated last year