llm-as-a-judge / Awesome-LLM-as-a-judgeLinks

☆388

Alternatives and similar repositories for Awesome-LLM-as-a-judge

Users that are interested in Awesome-LLM-as-a-judge are comparing it to the libraries listed below

Sorting:

Zhen-Tan-dmml / LLM4Annotation
☆591Updated last week
LuckyyySTA / Awesome-LLM-hallucination
LLM hallucination paper list
☆320Updated last year
quchangle1 / LLM-Tool-Survey
This is the repository for the Tool Learning survey.
☆418Updated 2 months ago
alon-albalak / data-selection-survey
A Survey on Data Selection for Language Models
☆245Updated 3 months ago
xinzhel / LLM-Agent-Survey
Survey on LLM Agents (Published on CoLing 2025)
☆358Updated 3 months ago
Eclipsess / Awesome-Efficient-Reasoning-LLMs
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
☆547Updated this week
voidism / DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
☆504Updated 6 months ago
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical report on Slow Thinking with LLM
☆713Updated last month
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>>
☆341Updated last year
DataArcTech / LLM-as-a-Judge
☆128Updated 4 months ago
cooperleong00 / Awesome-LLM-Interpretability
A curated list of LLM Interpretability related material - Tutorial, Library, Survey, Paper, Blog, etc..
☆262Updated 4 months ago
princeton-nlp / LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
☆474Updated 9 months ago
teacherpeterpan / self-correction-llm-papers
This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.
☆542Updated 9 months ago
zchuz / CoT-Reasoning-Survey
[ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
☆457Updated 6 months ago
ZigeW / data_management_LLM
Collection of training data management explorations for large language models
☆329Updated last year
XiaoYee / Awesome_Efficient_LRM_Reasoning
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
☆277Updated last month
AmourWaltz / Reliable-LLM
☆151Updated 10 months ago
LightChen233 / Awesome-Long-Chain-of-Thought-Reasoning
Latest Advances on Long Chain-of-Thought Reasoning
☆459Updated 2 weeks ago
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆129Updated 10 months ago
oneal2000 / PRAG
Code for Parametric RAG, SIGIR 2025 Full Paper
☆182Updated 3 months ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆234Updated 2 months ago
RUCAIBox / HaluEval
This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models.
☆497Updated last year
HITsz-TMG / awesome-llm-attributions
A Survey of Attributions for Large Language Models
☆207Updated 11 months ago
GAIR-NLP / cognition-engineering
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
☆198Updated 3 months ago
MozerWang / Loong
[EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
☆139Updated 8 months ago
pengr / LLM-Synthetic-Data
A live reading list for LLM-synthetic-data.
☆343Updated last week
bruno686 / Awesome-Agent-Training
Awesome Agent Training
☆204Updated last week
ZubinGou / math-evaluation-harness
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
☆239Updated last year
liunian-Jay / Awesome-RAG
An up-to-date curated list of Retrieval-Augmented Generation (RAG) for LLMs.
☆130Updated this week
jlko / semantic_uncertainty
Codebase for reproducing the experiments of the semantic uncertainty paper (short-phrase and sentence-length experiments).
☆349Updated last year