mitmedialab / nmi-ai-2023
A repository for the paper "Beliefs about AI influence human-AI interaction and can be manipulated to increase perceived trustworthiness, empathy, and effectiveness" (Nature Machine Intelligence, 2023).
☆17 Updated last year
Alternatives and similar repositories for nmi-ai-2023
Users who are interested in nmi-ai-2023 are comparing it to the repositories listed below.
- Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment ☆38 Updated 2 years ago
- ☆23 Updated last year
- ☆95 Updated last year
- DialOp: Decision-oriented dialogue environments for collaborative language agents ☆106 Updated 6 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆55 Updated last year
- Tasks for describing differences between text distributions. ☆16 Updated 9 months ago
- ☆106 Updated last year
- ☆12 Updated 2 years ago
- personalized-llms with allen institute ☆15 Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆61 Updated 2 years ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆76 Updated last year
- Codes and Datasets for our ACL 2023 paper on cognitive reframing of negative thoughts ☆62 Updated last year
- EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975 ☆37 Updated last year
- code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations ☆10 Updated last year
- Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique ☆16 Updated 9 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 7 months ago
- ☆35 Updated 2 years ago
- Evaluating the Moral Beliefs Encoded in LLMs ☆26 Updated 5 months ago
- Data Valuation on In-Context Examples (ACL23) ☆23 Updated 4 months ago
- Data and code for the paper "Inducing Positive Perspectives with Text Reframing" ☆61 Updated last year
- The KiloGram Tangrams dataset ☆55 Updated last month
- This repository contains data, code and models for contextual noncompliance. ☆22 Updated 10 months ago
- This repository contains the official code for the paper: "Prompt Injection: Parameterization of Fixed Inputs" ☆32 Updated 8 months ago
- ✨ Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆17 Updated last week
- This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models ☆52 Updated last year
- ☆24 Updated 9 months ago
- ☆36 Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆30 Updated 4 months ago
- Restore safety in fine-tuned language models through task arithmetic ☆28 Updated last year
- ☆78 Updated 2 years ago