dependentsign / Awesome-LLM-based-Evaluators
✨✨Latest Papers about LLM-based Evaluators
☆28 Updated last year
Alternatives and similar repositories for Awesome-LLM-based-Evaluators:
Users who are interested in Awesome-LLM-based-Evaluators are comparing it to the repositories listed below
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆170 Updated 4 months ago
- ☆53 Updated 8 months ago
- ☆72 Updated 4 months ago
- A Survey on Data Selection for Language Models ☆225 Updated 6 months ago
- ☆278 Updated last year
- Awesome LLM for NLG Evaluation Papers ☆23 Updated last year
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆133 Updated last year
- ☆174 Updated 2 years ago
- ☆71 Updated last year
- LLM hallucination paper list ☆314 Updated last year
- 🌲 Code for our EMNLP 2023 paper - 🎄 "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode… ☆49 Updated last year
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆112 Updated 7 months ago
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆246 Updated 2 years ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆127 Updated last year
- Multilingual Large Language Models Evaluation Benchmark ☆123 Updated 8 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆110 Updated 9 months ago
- Code Repo for EfficientRAG: Efficient Retriever for Multi-Hop Question Answering ☆45 Updated last month
- ☆313 Updated 3 weeks ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆67 Updated last year
- The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity" ☆337 Updated last year
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- ☆97 Updated last month
- Scaling Sentence Embeddings with Large Language Models ☆106 Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆80 Updated 2 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆109 Updated 7 months ago
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆58 Updated last year
- A curated list of awesome papers about information retrieval (IR) in the age of large language models (LLMs). These include retrieval augment… ☆63 Updated 8 months ago
- ☆18 Updated last year
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆126 Updated last week
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆55 Updated last year