amazon-science / auto-rag-eval
Code repo for the ICML 2024 paper "Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation"
☆70 Updated 7 months ago
Alternatives and similar repositories for auto-rag-eval:
Users interested in auto-rag-eval are comparing it to the repositories listed below:
- RefChecker provides an automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Langua… ☆336 Updated 2 months ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆132 Updated last month
- ☆126 Updated last month
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆82 Updated 5 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆124 Updated 10 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆63 Updated 11 months ago
- Comprehensive benchmark for RAG ☆99 Updated 2 months ago
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆101 Updated 9 months ago
- ☆137 Updated 5 months ago
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ☆242 Updated last month
- ☆40 Updated 2 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators" ☆121 Updated last year
- ☆115 Updated 3 months ago
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables". ☆122 Updated 8 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆106 Updated 3 weeks ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆141 Updated last year
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆97 Updated 4 months ago
- ☆149 Updated 5 months ago
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆117 Updated 6 months ago
- awesome synthetic (text) datasets ☆253 Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆75 Updated 3 months ago
- ☆65 Updated 9 months ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀 ☆54 Updated 2 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 Updated 9 months ago
- Knowledge Graph Retrieval Augmented Generation (KG-RAG) Eval Datasets ☆141 Updated 9 months ago
- Automated Evaluation of RAG Systems ☆526 Updated 2 months ago
- ☆34 Updated 5 months ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆119 Updated 10 months ago
- Finetune mistral-7b-instruct for sentence embeddings ☆74 Updated 8 months ago
- ☆124 Updated this week