skywalker023 / fantom
👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"
☆55 Updated last year
Alternatives and similar repositories for fantom
Users that are interested in fantom are comparing it to the libraries listed below
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc… ☆32 Updated 11 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆43 Updated last year
- ☆28 Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages ☆47 Updated 5 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 Updated last year
- ☆21 Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 8 months ago
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆33 Updated 5 months ago
- Inspecting and Editing Knowledge Representations in Language Models ☆116 Updated last year
- ☆54 Updated 2 weeks ago
- ☆10 Updated 8 months ago
- SILO Language Models code repository ☆81 Updated last year
- ☆44 Updated 9 months ago
- ☆31 Updated last year
- ☆24 Updated last year
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 Updated 2 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆43 Updated 6 months ago
- ☆21 Updated 2 years ago
- ☆20 Updated last month
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆98 Updated 2 years ago
- Benchmarking Generalization to New Tasks from Natural Language Instructions ☆26 Updated 3 years ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆28 Updated 8 months ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆40 Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆40 Updated 2 years ago
- ☆44 Updated 9 months ago
- ☆50 Updated last year
- Discriminator-Guided Chain-of-Thought Reasoning ☆47 Updated 7 months ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆25 Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Updated last year
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training ☆21 Updated 9 months ago