llm-as-a-judge / Awesome-LLM-as-a-judge
☆218Updated 3 weeks ago
Alternatives and similar repositories for Awesome-LLM-as-a-judge:
Users who are interested in Awesome-LLM-as-a-judge are comparing it to the repositories listed below
- ☆412Updated last week
- LLM hallucination paper list☆307Updated 11 months ago
- A Survey on Data Selection for Language Models☆213Updated 4 months ago
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining.☆131Updated last year
- ☆81Updated this week
- ☆260Updated 7 months ago
- The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"☆333Updated 10 months ago
- A series of technical reports on Slow Thinking with LLMs☆438Updated this week
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆114Updated 2 months ago
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`☆162Updated 3 months ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers☆126Updated last week
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆124Updated 8 months ago
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"☆464Updated last month
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆105Updated 5 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆221Updated 2 weeks ago
- RewardBench: the first evaluation tool for reward models.☆516Updated this week
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆143Updated 5 months ago
- Survey of Small Language Models from Penn State, ...☆161Updated last month
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆111Updated 3 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨☆176Updated 10 months ago
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23☆191Updated 8 months ago
- Collection of training data management explorations for large language models☆311Updated 7 months ago
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]☆535Updated 2 months ago
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement☆176Updated 11 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…☆107Updated 7 months ago
- ☆128Updated last year
- Survey on LLM Agents (published at COLING 2025)☆105Updated last month
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.☆203Updated 4 months ago
- Awesome papers for role-playing with language models☆167Updated 4 months ago
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"☆149Updated 3 months ago