chuzhumin98 / PRE
A general framework for evaluating the performance of large language models (LLMs) based on a peer-review mechanism among LLMs
☆19Updated last year
Alternatives and similar repositories for PRE
Users that are interested in PRE are comparing it to the libraries listed below
- [ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Gener…☆60Updated last year
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆143Updated last year
- Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation☆29Updated 9 months ago
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆164Updated last year
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain"☆78Updated 11 months ago
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation.☆143Updated 7 months ago
- Code for Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks (WWW 2024)☆58Updated 3 weeks ago
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024)☆72Updated 7 months ago
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation☆64Updated 6 months ago
- YuLan-IR: Information Retrieval Boosted LMs☆222Updated last year
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew…☆48Updated 10 months ago
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23☆240Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)☆97Updated 9 months ago
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆27Updated 2 years ago
- https://acl2023-retrieval-lm.github.io/☆157Updated 2 years ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆150Updated last year
- Source code for EMNLP 2023 paper "Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions".☆23Updated last year
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering"☆76Updated 7 months ago
- A curated list of awesome papers about information retrieval (IR) in the age of large language models (LLMs). These include retrieval augment…☆77Updated last year
- The code and data of DPA-RAG, accepted by WWW 2025 main conference.☆63Updated last month
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context"☆75Updated last year
- Generative Judge for Evaluating Alignment☆248Updated last year
- A framework for editing the CoTs for better factuality☆50Updated 2 years ago
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023)☆28Updated 2 years ago
- ☆212Updated last year
- [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆179Updated 3 months ago
- Test-time compute in information retrieval☆47Updated 5 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios☆71Updated 7 months ago
- Code and data for "The Power of Noise: Redefining Retrieval for RAG Systems"☆68Updated 5 months ago
- Code Repo for EfficientRAG: Efficient Retriever for Multi-Hop Question Answering☆62Updated 9 months ago