alisawuffles / ambient
Code and data associated with the AmbiEnt dataset in "We're Afraid Language Models Aren't Modeling Ambiguity" (Liu et al., 2023)
☆51 · Updated 7 months ago
Related projects:
- ☆64 · Updated 7 months ago
- [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. ☆91 · Updated last year
- Inspecting and Editing Knowledge Representations in Language Models ☆105 · Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" ☆54 · Updated 8 months ago
- ☆80 · Updated last year
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen… ☆56 · Updated last year
- Supporting code for ReCEval paper ☆26 · Updated this week
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆37 · Updated last year
- ☆70 · Updated 10 months ago
- Code and data for the FACTOR paper ☆36 · Updated 10 months ago
- ☆32 · Updated 5 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆96 · Updated last week
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… ☆44 · Updated 9 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆38 · Updated 9 months ago
- ☆46 · Updated 10 months ago
- ☆42 · Updated 7 months ago
- ☆23 · Updated last year
- This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” ☆83 · Updated 2 years ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆51 · Updated 5 months ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs. ☆73 · Updated 5 months ago
- ☆57 · Updated 2 years ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions" ☆61 · Updated 2 years ago
- ☆36 · Updated 5 months ago
- TBC ☆26 · Updated last year
- Retrieval as Attention ☆77 · Updated last year
- ☆77 · Updated last year
- ☆39 · Updated 9 months ago
- Code for Editing Factual Knowledge in Language Models ☆134 · Updated 2 years ago
- A unified benchmark for math reasoning ☆87 · Updated last year
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆37 · Updated 2 months ago