chuzhumin98 / PRE
A general framework for evaluating the performance of large language models (LLMs), based on a peer-review mechanism among LLMs
☆19Updated last year
Alternatives and similar repositories for PRE
Users that are interested in PRE are comparing it to the libraries listed below
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024)☆73Updated 9 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆146Updated last month
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain"☆81Updated last year
- The demo, code and data of FollowRAG☆75Updated 7 months ago
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆172Updated last year
- ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation☆57Updated last week
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation☆64Updated 8 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆151Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)☆101Updated 11 months ago
- Code Repo for EfficientRAG: Efficient Retriever for Multi-Hop Question Answering☆64Updated 11 months ago
- Generative Judge for Evaluating Alignment☆250Updated 2 years ago
- The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain☆40Updated last year
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆28Updated 2 years ago
- A framework for editing the CoTs for better factuality☆50Updated 2 years ago
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew…☆50Updated last year
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation.☆144Updated last month
- [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆189Updated 4 months ago
- A curated list of resources dedicated to retrieval-augmented generation (RAG).☆128Updated 3 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"☆75Updated last year
- ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry☆45Updated last month
- The code and data of DPA-RAG, accepted by WWW 2025 main conference.☆63Updated 3 months ago
- YuLan-IR: Information Retrieval Boosted LMs☆220Updated last year
- [ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Gener…☆60Updated last year
- The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>>☆341Updated last year
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23☆249Updated last year
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales☆136Updated last year
- [CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models☆17Updated 8 months ago
- ☆218Updated last year
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆73Updated 8 months ago
- 🌲 Code for our EMNLP 2023 paper - 🎄 "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode…☆54Updated 2 years ago