patronus-ai / financebench

☆184

Alternatives and similar repositories for financebench

Users who are interested in financebench compare it to the repositories listed below.

chentong0 / factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
☆160 Updated last year
predlico / ARAGOG
ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…
☆107 Updated last year
facebookresearch / CRAG
Comprehensive benchmark for RAG
☆198 Updated last month
spcl / MRAG
Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"
☆220 Updated last month
zeroentropy-ai / legalbenchrag
This is the repo for the LegalBench-RAG Paper: https://arxiv.org/abs/2408.10343.
☆105 Updated last month
apple / ml-superposition-prompting
☆145 Updated 11 months ago
docugami / KG-RAG-datasets
Knowledge Graph Retrieval Augmented Generation (KG-RAG) Eval Datasets
☆164 Updated last year
microsoft / llm-data-creation
Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators"
☆135 Updated last year
alopatenko / LLMEvaluation
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use…
☆123 Updated last week
yixuantt / MultiHop-RAG
Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024)
☆343 Updated 3 months ago
stephenleo / llm-structured-output-benchmarks
Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task…
☆173 Updated 9 months ago
ParticleMedia / RAGTruth
GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
☆191 Updated 7 months ago
rungalileo / hallucination-index
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
☆111 Updated 10 months ago
TIGER-AI-Lab / LongRAG
Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".
☆235 Updated 10 months ago
MadryLab / context-cite
Attribute (or cite) statements generated by LLMs back to in-context information.
☆245 Updated 9 months ago
salesforce / summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
☆79 Updated 9 months ago
YHPeter / Awesome-RAG-Evaluation
The official repository for the paper: Evaluation of Retrieval-Augmented Generation: A Survey.
☆162 Updated 2 months ago
Liyan06 / MiniCheck
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]
☆168 Updated 6 months ago
myeon9h / PlanRAG
Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24
☆142 Updated last year
CYQIQ / MultiCoT
Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph
☆144 Updated last year
stanford-futuredata / ARES
Automated Evaluation of RAG Systems
☆624 Updated 3 months ago
brandonstarxel / chunking_evaluation
This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.…
☆346 Updated 4 months ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆494 Updated last week
zetaalphavector / RAGElo
RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker
☆113 Updated last week
ritun16 / chain-of-verification
This repository implements the chain of verification paper by Meta AI
☆171 Updated last year
sauravjoshi23 / towards-agi
A collection of personally developed projects contributing towards the advancement of Artificial General Intelligence (AGI)
☆127 Updated last year
wang-research-lab / agentinstruct
Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"
☆113 Updated 10 months ago
KarelDO / xmc.dspy
In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
☆432 Updated last year
FudanDNN-NLP / RAG
This is an implementation of the paper: Searching for Best Practices in Retrieval-Augmented Generation (EMNLP2024)
☆327 Updated 6 months ago
davanstrien / awesome-synthetic-datasets
awesome synthetic (text) datasets
☆289 Updated last week