amazon-science / auto-rag-eval
Code repo for the ICML 2024 paper "Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation"
☆63 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for auto-rag-eval
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆122 · Updated 7 months ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆78 · Updated 3 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆131 · Updated 10 months ago
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆114 · Updated last month
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆51 · Updated last month
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆96 · Updated 6 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆121 · Updated this week
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ☆194 · Updated 3 months ago
- RefChecker provides an automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Langua… ☆300 · Updated this week
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆128 · Updated this week
- Codebase accompanying the Summary of a Haystack paper. ☆72 · Updated last month
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆96 · Updated 4 months ago
- ☆111 · Updated last month
- ☆131 · Updated 3 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆261 · Updated 3 months ago
- awesome synthetic (text) datasets ☆239 · Updated 2 weeks ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆114 · Updated this week
- ☆100 · Updated 2 months ago
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ☆531 · Updated last month
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆105 · Updated last week
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆105 · Updated 3 weeks ago
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆88 · Updated 8 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆142 · Updated last month
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆160 · Updated last year
- Knowledge Graph Retrieval Augmented Generation (KG-RAG) Eval Datasets ☆126 · Updated 7 months ago
- Corrective Retrieval Augmented Generation ☆293 · Updated last month
- Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment" ☆258 · Updated 9 months ago
- Automated Evaluation of RAG Systems ☆482 · Updated last week
- Finetune mistral-7b-instruct for sentence embeddings ☆70 · Updated 6 months ago
- Let's build better datasets, together! ☆202 · Updated 3 months ago