☆171 · Oct 12, 2025 · Updated 4 months ago
Alternatives and similar repositories for LLM-as-a-Judge
Users who are interested in LLM-as-a-Judge are comparing it to the libraries listed below
- Short RL ☆18 · May 26, 2025 · Updated 9 months ago
- ☆112 · Nov 7, 2024 · Updated last year
- ☆530 · Jul 25, 2025 · Updated 7 months ago
- ☆16 · Jul 23, 2024 · Updated last year
- ☆52 · Feb 12, 2025 · Updated last year
- ☆20 · Jun 7, 2020 · Updated 5 years ago
- RuleRAG: Rule Meets Retrieval-Augmented Generation for Question Answering ☆32 · Oct 8, 2025 · Updated 5 months ago
- ☆39 · Dec 14, 2024 · Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆43 · Feb 27, 2025 · Updated last year
- ☆72 · Jun 10, 2025 · Updated 9 months ago
- Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation ☆63 · Sep 28, 2025 · Updated 5 months ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆19 · Nov 4, 2025 · Updated 4 months ago
- MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ☆22 · Sep 23, 2025 · Updated 5 months ago
- ☆13 · Aug 7, 2025 · Updated 7 months ago
- For ACL25 paper "WAFFLE: Multi-Modal Model for Automated Front-End Development" - by Shanchao Liang and Nan Jiang and Shangshu Qian and L… ☆11 · May 28, 2025 · Updated 9 months ago
- Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets ☆17 · Feb 18, 2025 · Updated last year
- Implementation for EACL 2024 paper "Corpus-Steered Query Expansion with Large Language Models" ☆12 · Mar 19, 2024 · Updated last year
- ☆29 · Feb 24, 2025 · Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆28 · Dec 10, 2024 · Updated last year
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 Pytorch Code) ☆17 · May 15, 2025 · Updated 9 months ago
- LMM for VQA, tcsvt version ☆11 · Jul 19, 2024 · Updated last year
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆13 · Dec 13, 2024 · Updated last year
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 · Sep 22, 2025 · Updated 5 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 · Dec 19, 2024 · Updated last year
- ☆14 · May 7, 2024 · Updated last year
- ☆15 · Nov 7, 2024 · Updated last year
- ☆37 · May 5, 2025 · Updated 10 months ago
- Collection of papers for scalable automated alignment. ☆93 · Oct 22, 2024 · Updated last year
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- ☆15 · May 15, 2025 · Updated 9 months ago
- Author implementation of "Contextualized Word Representations for Reading Comprehension" (Salant et al. 2017) ☆11 · Jun 14, 2018 · Updated 7 years ago
- [TMLR] Process Reward Models That Think ☆81 · Nov 29, 2025 · Updated 3 months ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024) ☆14 · Oct 3, 2024 · Updated last year
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders ☆18 · May 23, 2025 · Updated 9 months ago
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022) ☆16 · Feb 11, 2023 · Updated 3 years ago
- ☆21 · Jul 18, 2024 · Updated last year
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆153 · Sep 21, 2024 · Updated last year
- ☆38 · Feb 8, 2024 · Updated 2 years ago