huashen218 / bidirectional-alignment-reading-list
The Survey Paper of "Bidirectional Human-AI Alignment"
☆15 Updated 5 months ago
Alternatives and similar repositories for bidirectional-alignment-reading-list:
Users who are interested in bidirectional-alignment-reading-list are comparing it to the repositories listed below.
- ☆153 Updated 2 months ago
- ☆87 Updated 2 years ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language ☆42 Updated last year
- ☆96 Updated 2 years ago
- ☆49 Updated last year
- Distributional Generalization in NLP. A roadmap. ☆87 Updated 2 years ago
- ☆153 Updated 7 months ago
- ☆22 Updated 10 months ago
- Code for the paper "Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias" ☆74 Updated 3 years ago
- [ACL 2020] Towards Debiasing Sentence Representations ☆64 Updated 2 years ago
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆101 Updated last year
- ☆21 Updated 3 months ago
- FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback ☆11 Updated 2 years ago
- ☆46 Updated last year
- ☆100 Updated 8 months ago
- tianlu-wang / Identifying-and-Mitigating-Spurious-Correlations-for-Improving-Robustness-in-NLP-Models NAACL 2022 Findings ☆15 Updated 2 years ago
- Teaching Models to Express Their Uncertainty in Words ☆36 Updated 2 years ago
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022) ☆14 Updated last year
- ☆41 Updated last year
- ☆28 Updated last year
- ☆77 Updated 2 years ago
- ☆33 Updated 3 years ago
- ☆26 Updated 2 years ago
- ☆83 Updated 7 months ago
- Implementation for https://arxiv.org/abs/2005.00652 ☆28 Updated 2 years ago
- Independent implementation of DBCA method from http://arxiv.org/abs/1912.09713 ☆11 Updated 4 years ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆44 Updated 10 months ago
- Conformal Language Modeling ☆28 Updated last year
- [ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models ☆60 Updated 2 years ago
- CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior ☆12 Updated 2 years ago