1171-jpg / BrainTeaser
☆12 · Updated last year
Alternatives and similar repositories for BrainTeaser:
Users who are interested in BrainTeaser are comparing it to the repositories listed below.
- ☆12 · Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆59 · Updated last year
- Augmenting Statistical Models with Natural Language Parameters ☆23 · Updated 5 months ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆135 · Updated 4 months ago
- 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts ☆37 · Updated 4 months ago
- ☆43 · Updated 2 years ago
- [NAACL'25] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆47 · Updated 2 months ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆21 · Updated 2 years ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆27 · Updated 6 months ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆75 · Updated 3 months ago
- GitHub repo for MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning ☆14 · Updated 8 months ago
- A benchmark dataset for evaluating dialog system and natural language generation metrics. ☆36 · Updated 2 years ago
- ☆71 · Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Updated last year
- Automatic metrics for GEM tasks ☆64 · Updated 2 years ago
- ☆26 · Updated 9 months ago
- Supporting code for ReCEval paper ☆28 · Updated 5 months ago
- ☆22 · Updated last week
- Entity-Based Knowledge Conflicts in Question Answering. Code repo for EMNLP2021 paper: https://aclanthology.org/2021.emnlp-main.565/ ☆72 · Updated 2 years ago
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆34 · Updated 2 months ago
- ☆19 · Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆21 · Updated 2 months ago
- Text generation using language models with multiple exit heads ☆15 · Updated 2 weeks ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆42 · Updated 7 months ago
- ☆25 · Updated 11 months ago
- ☆82 · Updated 5 months ago
- Benchmarking Generalization to New Tasks from Natural Language Instructions ☆26 · Updated 3 years ago
- ☆19 · Updated last year
- Critique-out-Loud Reward Models ☆52 · Updated 4 months ago