stanford-futuredata / ARES
Automated Evaluation of RAG Systems
★569 · Updated this week
Alternatives and similar repositories for ARES:
Users who are interested in ARES are comparing it to the libraries listed below:
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ★815 · Updated 3 months ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ★419 · Updated this week
- Evaluate your LLM's response with Prometheus and GPT4 💯 ★893 · Updated 2 weeks ago
- Corrective Retrieval Augmented Generation ★359 · Updated 5 months ago
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ★286 · Updated 4 months ago
- Forward-Looking Active REtrieval-augmented generation (FLARE) ★621 · Updated last year
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ★1,153 · Updated 7 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ★1,359 · Updated 2 weeks ago
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ★504 · Updated 9 months ago
- Fine-Tuning Embedding for RAG with Synthetic Data ★489 · Updated last year
- ★855 · Updated 5 months ago
- This is an implementation of the paper: Searching for Best Practices in Retrieval-Augmented Generation (EMNLP2024) ★305 · Updated 3 months ago
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. ★763 · Updated last month
- The official repository for the paper: Evaluation of Retrieval-Augmented Generation: A Survey. ★143 · Updated 5 months ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ★228 · Updated 7 months ago
- RefChecker provides automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Langua… ★357 · Updated 4 months ago
- Generative Representational Instruction Tuning ★613 · Updated 2 weeks ago
- Code for explaining and evaluating late chunking (chunked pooling) ★355 · Updated 3 months ago
- HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels ★524 · Updated 3 months ago
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ★1,078 · Updated last week
- Efficient Retrieval Augmentation and Generation Framework ★1,500 · Updated 2 months ago
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ★478 · Updated 5 months ago
- Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award] ★585 · Updated last year
- [EMNLP 2024: Demo Oral] RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation ★293 · Updated 5 months ago
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ★2,012 · Updated 10 months ago
- ★504 · Updated 4 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ★415 · Updated last year
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ★166 · Updated 4 months ago
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs" ★202 · Updated 5 months ago
- Automatically evaluate your LLMs in Google Colab ★613 · Updated 10 months ago