myracheng / markedpersonas
Code and data for Marked Personas (ACL 2023)
☆23 Updated last year
Alternatives and similar repositories for markedpersonas:
Users who are interested in markedpersonas are comparing it to the repositories listed below:
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆77 Updated 4 years ago
- Repository for the Bias Benchmark for QA dataset. ☆105 Updated last year
- UnQovering Stereotyping Biases via Underspecified Questions - EMNLP 2020 (Findings) ☆21 Updated 3 years ago
- ☆128 Updated last year
- Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators" ☆19 Updated last year
- ☆104 Updated 10 months ago
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆134 Updated 3 months ago
- This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Maske… ☆115 Updated last year
- ☆57 Updated 4 months ago
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆80 Updated 6 months ago
- ☆16 Updated last year
- ☆47 Updated last year
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975 ☆37 Updated last year
- ☆25 Updated 2 years ago
- The implementation of "RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question" [ACL 2023] ☆15 Updated 11 months ago
- Resources for cultural NLP research ☆86 Updated 2 months ago
- ☆25 Updated last year
- BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages ☆28 Updated 3 months ago
- ☆22 Updated last year
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆58 Updated last year
- Can Large Language Models Be an Alternative to Human Evaluations? ☆9 Updated last year
- ☆25 Updated 6 months ago
- ☆19 Updated 3 months ago
- The official repo for SocKET: Social Knowledge Evaluation Tests ☆23 Updated last year
- Codebase, data and models for the SummaC paper in TACL ☆89 Updated 2 months ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆21 Updated 3 weeks ago
- ☆68 Updated 3 months ago
- ☆36 Updated last year
- Code and test data for "On Measuring Bias in Sentence Encoders", to appear at NAACL 2019. ☆54 Updated 3 years ago
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆13 Updated last year