patronus-ai / financebench
☆203 · Updated 8 months ago
Alternatives and similar repositories for financebench
Users who are interested in financebench are comparing it to the repositories listed below.
- Comprehensive benchmark for RAG ☆211 · Updated 2 months ago
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆109 · Updated last year
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆160 · Updated last year
- Official implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs" ☆226 · Updated 2 months ago
- Knowledge Graph Retrieval Augmented Generation (KG-RAG) Eval Datasets ☆172 · Updated last year
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples ☆435 · Updated last year
- Automated Evaluation of RAG Systems ☆647 · Updated 5 months ago
- A comprehensive guide to LLM evaluation methods, designed to assist in identifying the most suitable evaluation techniques for various use… ☆138 · Updated last week
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ☆356 · Updated 5 months ago
- Sample notebooks and prompts for LLM evaluation ☆138 · Updated 2 months ago
- Repository for the LegalBench-RAG paper: https://arxiv.org/abs/2408.10343 ☆124 · Updated 3 months ago
- RankLLM, a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking ☆525 · Updated last week
- Benchmarking library for RAG ☆224 · Updated last month
- Vision Document Retrieval (ViDoRe) benchmark; evaluation code for the ColPali paper ☆233 · Updated 3 weeks ago
- ☆145 · Updated last year
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆196 · Updated 9 months ago
- Attribute (or cite) statements generated by LLMs back to in-context information ☆274 · Updated 10 months ago
- RAGElo, a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker ☆114 · Updated this week
- Implementation of the paper "Searching for Best Practices in Retrieval-Augmented Generation" (EMNLP 2024) ☆333 · Updated 8 months ago
- Benchmark of various LLM structured-output frameworks (Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc.) on task… ☆176 · Updated 11 months ago
- Code repo for the ICML 2024 paper "Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation" ☆81 · Updated last year
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs" ☆238 · Updated last year
- Package, developed as part of research detailed in the Chroma Technical Report, providing tools for text chunking and evaluation… ☆395 · Updated 5 months ago
- Official repository for the paper "Evaluation of Retrieval-Augmented Generation: A Survey" ☆173 · Updated 4 months ago
- Repository for "PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers" (NAACL 2024) ☆145 · Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate ☆114 · Updated last month
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆557 · Updated last year
- Official repository for Auto-RAG ☆218 · Updated last month
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 · Updated last year
- Testing the speed and accuracy of RAG with and without a Cross-Encoder reranker ☆48 · Updated last year
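Several of the tools above rank RAG systems from pairwise judgments; the RAGElo entry does this with an Elo ranker. A minimal sketch of the standard Elo update on pairwise comparisons is below. This is an illustration of the general Elo formula only, not RAGElo's actual API: the function name `elo_update` and the constants (base rating 1000, K = 32) are assumptions chosen for the example.

```python
def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if agent A's answer was judged better than agent B's,
    0.0 if worse, and 0.5 for a tie.
    """
    # Expected score of A under the logistic Elo model (400-point scale)
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    # Symmetric update: A gains exactly what B loses
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Two RAG agents start at 1000; A wins three judged comparisons in a row.
a, b = 1000.0, 1000.0
for _ in range(3):
    a, b = elo_update(a, b, score_a=1.0)
print(a, b)  # A's rating rises, B's falls by the same amount each round
```

Because the update is symmetric, the total rating mass is conserved, and each successive win against the same opponent moves the ratings by a smaller amount as the expected score drifts toward 1.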