dependentsign / Awesome-LLM-based-Evaluators

✨✨Latest Papers about LLM-based Evaluators

☆30

Alternatives and similar repositories for Awesome-LLM-based-Evaluators

Users interested in Awesome-LLM-based-Evaluators are comparing it to the repositories listed below.

ParticleMedia / RAGTruth
GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
☆192 Updated 7 months ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆493 Updated last year
llm-as-a-judge / Awesome-LLM-as-a-judge
☆384 Updated last month
alon-albalak / data-selection-survey
A Survey on Data Selection for Language Models
☆243 Updated 2 months ago
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"
☆341 Updated last year
LuckyyySTA / Awesome-LLM-hallucination
LLM hallucination paper list
☆319 Updated last year
AI21Labs / in-context-ralm
☆284 Updated last year
shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆363 Updated 3 months ago
princeton-nlp / ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
☆489 Updated 9 months ago
teacherpeterpan / self-correction-llm-papers
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
☆538 Updated 8 months ago
hyintell / awesome-refreshing-llms
EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining.
☆134 Updated last year
voidism / DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
☆503 Updated 6 months ago
Libr-AI / do-not-answer
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
☆263 Updated last year
nlpyang / geval
Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment"
☆365 Updated last year
HITsz-TMG / awesome-llm-attributions
A Survey of Attributions for Large Language Models
☆206 Updated 11 months ago
carriex / recomp
RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation.
☆134 Updated 2 months ago
nelson-liu / lost-in-the-middle
Code and data for "Lost in the Middle: How Language Models Use Long Contexts"
☆351 Updated last year
RUCAIBox / Language-Specific-Neurons
☆76 Updated 7 months ago
chongyangtao / LLMs-for-NLG-Evaluation
Awesome LLM for NLG Evaluation Papers
☆24 Updated last year
zorazrw / awesome-tool-llm
☆237 Updated 11 months ago
shizhediao / active-prompt
Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models"
☆243 Updated last year
glgh / awesome-llm-human-preference-datasets
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
☆369 Updated last year
DataArcTech / LLM-as-a-Judge
☆123 Updated 4 months ago
TIGER-AI-Lab / Program-of-Thoughts
Data and Code for Program of Thoughts [TMLR 2023]
☆279 Updated last year
weizhepei / InstructRAG
[ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
☆109 Updated 5 months ago
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.
☆259 Updated 4 months ago
FreedomIntelligence / ReasoningNLP
Paper list on reasoning in NLP
☆190 Updated 3 months ago
SuperBruceJia / Awesome-LLM-Self-Consistency
Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models
☆102 Updated 11 months ago
Sahandfer / PersonaPaper
This is a repository for sharing papers in the field of persona-based conversational AI. The related source code for each paper is linked…
☆163 Updated last year
OSU-NLP-Group / LLM-Knowledge-Conflict
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
☆70 Updated last year