dmis-lab / ETHIC
[NAACL 2025] ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
☆16 Updated 5 months ago
Alternatives and similar repositories for ETHIC
Users that are interested in ETHIC are comparing it to the libraries listed below
- 🌲 Code for our EMNLP 2023 paper - "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode… ☆54 Updated 2 years ago
- [ICLR 2025] ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains ☆17 Updated 11 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆95 Updated last year
- [EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers". ☆23 Updated last year
- official repository for ListT5 ☆48 Updated 2 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆81 Updated last year
- ☆47 Updated 4 months ago
- ACL 2023: Evaluating Open-Domain Question Answering in the Era of Large Language Models ☆47 Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation ☆62 Updated 11 months ago
- Official codebase for permutation self-consistency. ☆18 Updated last year
- [EMNLP 2024] CompAct: Compressing Retrieved Documents Actively for Question Answering ☆38 Updated last year
- ☆91 Updated last year
- Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023) ☆22 Updated 2 years ago
- ☆78 Updated last year
- ☆19 Updated last year
- ☆17 Updated 2 years ago
- ☆22 Updated 2 years ago
- [NeurIPS 2025] Reasoning Models Better Express Their Confidence ☆22 Updated 2 months ago
- Personalized Story Evaluation Model ☆18 Updated 2 years ago
- ☆75 Updated last year
- ☆41 Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc… ☆32 Updated last year
- Enhancing contextual understanding in large language models through contrastive decoding ☆20 Updated last year
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆119 Updated last year
- [EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models ☆74 Updated last year
- A comprehensive paper list of Reasoning over Tables. ☆30 Updated 3 years ago
- ☆32 Updated 2 years ago
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 Updated 10 months ago
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks… ☆42 Updated last year
- Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings) ☆16 Updated last year