valentinhofmann / dialect-prejudice
☆35 Updated 7 months ago
Alternatives and similar repositories for dialect-prejudice
Users who are interested in dialect-prejudice are comparing it to the repositories listed below
- Repository for the Bias Benchmark for QA dataset. ☆115 Updated last year
- ☆24 Updated 2 years ago
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆80 Updated 4 years ago
- ☆132 Updated last year
- The Prism Alignment Project ☆75 Updated last year
- ☆40 Updated 5 months ago
- ☆106 Updated last year
- Detecting Bias and ensuring Fairness in AI solutions ☆91 Updated 2 years ago
- ☆94 Updated 3 months ago
- Fairness toolkit for pytorch, scikit learn and autogluon ☆32 Updated 5 months ago
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆92 Updated last year
- This repository contains two datasets with multi-turn adversarial conversations generated by human agents interacting with a dialog model… ☆26 Updated 10 months ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆97 Updated 2 months ago
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆136 Updated 5 months ago
- ☆24 Updated 6 months ago
- Evaluating the Moral Beliefs Encoded in LLMs ☆26 Updated 5 months ago
- Code/data for MARG (multi-agent review generation) ☆43 Updated 6 months ago
- ☆50 Updated last year
- A resource repository for representation engineering in large language models ☆120 Updated 6 months ago
- ☆233 Updated last month
- Repository for research in the field of Responsible NLP at Meta. ☆199 Updated 5 months ago
- Steering Llama 2 with Contrastive Activation Addition ☆151 Updated 11 months ago
- ☆22 Updated last year
- Paper list for the survey "Combating Misinformation in the Age of LLMs: Opportunities and Challenges" and the initiative "LLMs Meet Misin… ☆103 Updated 6 months ago
- The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?" ☆64 Updated 6 months ago
- This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Maske… ☆118 Updated last year
- Medical Hallucination in Foundation Models and Their Impact on Healthcare (2025) ☆53 Updated 2 months ago
- ☆30 Updated 6 months ago
- ☆14 Updated 2 months ago
- 👩💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 Updated last year