weiyifan1023 / MenatQA
Code and Data for EMNLP 2023 Paper "MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models"
☆14 · Updated 9 months ago
Alternatives and similar repositories for MenatQA
Users who are interested in MenatQA are comparing it to the repositories listed below.
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆81 · Updated last year
- 🌲 Code for our EMNLP 2023 paper - "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode… ☆53 · Updated 2 years ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆129 · Updated last year
- ☆78 · Updated last year
- Enhancing contextual understanding in large language models through contrastive decoding ☆20 · Updated last year
- ☆89 · Updated last year
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆118 · Updated last year
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆151 · Updated last year
- ☆22 · Updated last year
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆127 · Updated last year
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and… ☆64 · Updated 8 months ago
- ☆187 · Updated 6 months ago
- ☆48 · Updated 2 years ago
- The repository for paper <Evaluating Open-QA Evaluation> ☆25 · Updated last year
- [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆189 · Updated 4 months ago
- Official codebase for permutation self-consistency. ☆18 · Updated last year
- Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024) ☆58 · Updated 4 months ago
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)" ☆53 · Updated 2 years ago
- Source code of our paper MIND, ACL 2024 Long Paper ☆60 · Updated 2 months ago
- Source code for Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts ☆17 · Updated last year
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆132 · Updated last year
- AbstainQA, ACL 2024 ☆28 · Updated last year
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆63 · Updated 2 years ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆146 · Updated last month
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23 ☆248 · Updated last year
- ☆58 · Updated last year
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context" ☆75 · Updated last year
- ☆294 · Updated 2 years ago
- Code and data for "The Power of Noise: Redefining Retrieval for RAG Systems" ☆71 · Updated 6 months ago
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models" ☆34 · Updated last year