DataArcTech/LLM-as-a-Judge

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/DataArcTech/LLM-as-a-Judge)

DataArcTech / LLM-as-a-Judge

☆171

Alternatives and similar repositories for LLM-as-a-Judge

Users that are interested in LLM-as-a-Judge are comparing it to the libraries listed below

Sorting:

lblankl / Short-RL
View on GitHub
Short RL
☆18May 26, 2025Updated 9 months ago
ScalerLab / JudgeBench
View on GitHub
☆112Nov 7, 2024Updated last year
llm-as-a-judge / Awesome-LLM-as-a-judge
View on GitHub
☆530Jul 25, 2025Updated 7 months ago
yale-nlp / refdpo
View on GitHub
☆16Jul 23, 2024Updated last year
LAMDASZ-ML / Self-Backtracking
View on GitHub
☆52Feb 12, 2025Updated last year
jiangycTarheel-zz / NMN-MultiHopQA
View on GitHub
☆20Jun 7, 2020Updated 5 years ago
chenzhongwu20 / RuleRAG_ICL_FT
View on GitHub
RuleRAG: Rule Meets Retrieval-Augmented Generation for Question Answering
☆32Oct 8, 2025Updated 5 months ago
miralab-ai / autoreason
View on GitHub
☆39Dec 14, 2024Updated last year
DAMO-NLP-SG / LongPO
View on GitHub
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆43Feb 27, 2025Updated last year
test-time-interaction / TTI
View on GitHub
☆72Jun 10, 2025Updated 9 months ago
GasolSun36 / DynamicRAG
View on GitHub
Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
☆63Sep 28, 2025Updated 5 months ago
Vision-CAIR / Infinibench
View on GitHub
Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows
☆19Nov 4, 2025Updated 4 months ago
chchenhui / mlrbench
View on GitHub
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
☆22Sep 23, 2025Updated 5 months ago
Vinoground / Vinoground
View on GitHub
☆13Aug 7, 2025Updated 7 months ago
lt-asset / Waffle
View on GitHub
For ACL25 paper "WAFFLE: Multi-Modal Model for Automated Front-End Development" - by Shanchao Liang and Nan Jiang and Shangshu Qian and L…
☆11May 28, 2025Updated 9 months ago
DataArcTech / Simple-HHEA
View on GitHub
Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets
☆17Feb 18, 2025Updated last year
Yibin-Lei / CSQE
View on GitHub
Implementation for EACL 2024 paper "Corpus-Steered Query Expansion with Large Language Models"
☆12Mar 19, 2024Updated last year
jaehunjung1 / cascaded-selective-evaluation
View on GitHub
☆29Feb 24, 2025Updated last year
Tebmer / Rereading-LLM-Reasoning
View on GitHub
EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…
☆28Dec 10, 2024Updated last year
D2I-ai / Route
View on GitHub
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code)
☆17May 15, 2025Updated 9 months ago
Sueqk / LMM-VQA
View on GitHub
LMM for VQA, tcsvt version
☆11Jul 19, 2024Updated last year
TsinghuaC3I / FS-GEN
View on GitHub
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.
☆13Nov 19, 2024Updated last year
chanchimin / AgentMonitor
View on GitHub
Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"
☆13Dec 13, 2024Updated last year
EvanZhuang / AgenticLU
View on GitHub
Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).
☆12Sep 22, 2025Updated 5 months ago
jinzhuoran / RAG-RewardBench
View on GitHub
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆16Dec 19, 2024Updated last year
UCSB-NLP-Chang / SelfDenoise
View on GitHub
☆14May 7, 2024Updated last year
ArminAzizi98 / LaMDA
View on GitHub
☆15Nov 7, 2024Updated last year
jxnl / instructor-classify
View on GitHub
☆37May 5, 2025Updated 10 months ago
icip-cas / awesome-auto-alignment
View on GitHub
Collection of papers for scalable automated alignment.
☆93Oct 22, 2024Updated last year
jiaangli / VLCA
View on GitHub
[TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16Nov 22, 2024Updated last year
DataArcTech / ChartBench
View on GitHub
☆15May 15, 2025Updated 9 months ago
shimisalant / CWR
View on GitHub
Author implementation of "Contextualized Word Representations for Reading Comprehension" (Salant et al. 2017)
☆11Jun 14, 2018Updated 7 years ago
mukhal / ThinkPRM
View on GitHub
[TMLR] Process Reward Models That Think
☆81Nov 29, 2025Updated 3 months ago
cambridgeltl / zepo
View on GitHub
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)
☆14Oct 3, 2024Updated last year
webis-de / set-encoder
View on GitHub
Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders
☆18May 23, 2025Updated 9 months ago
xiye17 / TextualExplInContext
View on GitHub
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)
☆16Feb 11, 2023Updated 3 years ago
Marker-Inc-Korea / AutoRAG_ARAGOG_Paper
View on GitHub
☆21Jul 18, 2024Updated last year
pillowsofwind / Knowledge-Conflicts-Survey
View on GitHub
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆153Sep 21, 2024Updated last year
ggjy / vision_weak_to_strong
View on GitHub
☆38Feb 8, 2024Updated 2 years ago