dmg-illc / JUDGE-BENCH
☆27 Updated last month
Alternatives and similar repositories for JUDGE-BENCH
Users interested in JUDGE-BENCH are comparing it to the repositories listed below.
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆24 Updated 3 months ago
- ☆35 Updated 3 years ago
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆41 Updated last year
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆38 Updated 2 years ago
- ☆48 Updated last year
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 3 years ago
- ☆38 Updated last year
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 Updated 10 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆25 Updated 3 months ago
- ☆29 Updated 11 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆44 Updated 11 months ago
- Materials for "Prompting is not a substitute for probability measurements in large language models" (EMNLP 2023) ☆24 Updated last year
- A curated list of research papers and resources on Cultural LLM. ☆44 Updated 9 months ago
- Code repository for the paper "Mission: Impossible Language Models." ☆52 Updated last month
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation. ☆23 Updated last year
- ☆22 Updated 2 years ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆85 Updated 10 months ago
- ☆28 Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 3 years ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆21 Updated 2 years ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language ☆42 Updated 2 years ago
- ☆44 Updated last year
- Apps built using Inspired Cognition's Critique. ☆58 Updated 2 years ago