amazon-science / auto-rag-eval
Code repo for the ICML 2024 paper "Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation"
☆73Updated 10 months ago
Alternatives and similar repositories for auto-rag-eval:
Users who are interested in auto-rag-eval are comparing it to the libraries listed below.
- RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Langua…☆361Updated 5 months ago
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"☆170Updated 4 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers☆128Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆77Updated 7 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use?☆155Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆83Updated 8 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆106Updated 7 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark☆137Updated 4 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.☆422Updated last year
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24☆139Updated 10 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information.☆228Updated 6 months ago
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks☆55Updated last year
- Comprehensive benchmark for RAG☆170Updated 5 months ago
- Benchmarking library for RAG☆193Updated last week
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…☆101Updated last year
- [Preprint] Learning to Filter Context for Retrieval-Augmented Generation☆192Updated last year
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation.☆126Updated last week
- A generative AI-powered framework for testing virtual agents.☆222Updated 3 weeks ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"☆203Updated 4 months ago
- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance☆69Updated 5 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆108Updated last week