CSHaitao / Awesome-LLMs-as-Judges
The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods".
☆384Updated 6 months ago
Alternatives and similar repositories for Awesome-LLMs-as-Judges
Users that are interested in Awesome-LLMs-as-Judges are comparing it to the libraries listed below
- ☆363Updated 3 weeks ago
- Controllable Text Generation for Large Language Models: A Survey☆179Updated 9 months ago
- [ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.☆167Updated 2 weeks ago
- A recipe for online RLHF and online iterative DPO.☆520Updated 5 months ago
- Recipes to train the self-rewarding reasoning LLMs.☆223Updated 3 months ago
- LLM hallucination paper list☆318Updated last year
- ☆117Updated 3 months ago
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models☆464Updated last week
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"☆498Updated 5 months ago
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`☆179Updated 6 months ago
- Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin…☆169Updated 6 months ago
- ☆573Updated 3 weeks ago
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework☆231Updated 2 weeks ago
- ☆241Updated 2 weeks ago
- Awesome Agent Training☆164Updated this week
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025).☆305Updated last month
- Recipes to train reward model for RLHF.☆1,380Updated 2 months ago
- Benchmarking LLMs via Uncertainty Quantification☆232Updated last year
- Official implementation of RARE: Retrieval-Augmented Reasoning Modeling☆181Updated 3 weeks ago
- ☆242Updated last month
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond☆252Updated 2 weeks ago
- ☆300Updated 3 weeks ago
- Latest Advances on Long Chain-of-Thought Reasoning☆390Updated 3 weeks ago
- This includes the original implementation of CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control.☆61Updated 8 months ago
- ☆139Updated 3 months ago
- ☆208Updated last month
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning☆357Updated 9 months ago
- Generative Judge for Evaluating Alignment☆239Updated last year
- ☆178Updated 2 months ago
- ☆203Updated 4 months ago