skywalker023 / fantom
👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"
☆51 Updated 5 months ago
Related projects
Alternatives and complementary repositories for fantom
- ☆48 Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆28 Updated 4 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards ☆44 Updated 6 months ago
- ☆26 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆41 Updated 9 months ago
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆36 Updated last month
- Inspecting and Editing Knowledge Representations in Language Models ☆107 Updated last year
- Critique-out-Loud Reward Models ☆36 Updated 3 weeks ago
- ☆26 Updated last year
- ☆26 Updated 7 months ago
- SILO Language Models code repository ☆80 Updated 8 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆96 Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆62 Updated last year
- [EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models ☆66 Updated 5 months ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆27 Updated last month
- ☆23 Updated 11 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆78 Updated last week
- ☆11 Updated 2 years ago
- [EMNLP 2023, Findings] GRACE: Discriminator-Guided Chain-of-Thought Reasoning ☆44 Updated 3 weeks ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆111 Updated last month
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models" ☆39 Updated 9 months ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut… ☆18 Updated 3 weeks ago
- The corresponding code from our paper "REFINER: Reasoning Feedback on Intermediate Representations" (EACL 2024). Do not hesitate t… ☆66 Updated 8 months ago
- [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. ☆92 Updated last year
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆97 Updated last year
- ☆44 Updated 2 months ago
- DEMix Layers for Modular Language Modeling ☆53 Updated 3 years ago
- Supporting code for ReCEval paper ☆26 Updated last month
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆55 Updated last year
- ☆34 Updated 3 months ago