☆275 · Dec 3, 2024 · Updated last year
Alternatives and similar repositories for financebench
Users who are interested in financebench are comparing it to the repositories listed below.
- KITE (Knowledge-Intensive Task Evaluation) is an end-to-end benchmark for RAG pipelines ☆23 · Aug 14, 2024 · Updated last year
- Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data" ☆358 · Jun 6, 2022 · Updated 3 years ago
- An OpenBB agent slack bot that is ready to answer any financial question ☆12 · Feb 24, 2024 · Updated 2 years ago
- A package to parse SEC XBRL at scale. ☆18 · Nov 25, 2025 · Updated 3 months ago
- KEDA External Scaler for Azure Cosmos DB ☆11 · Mar 14, 2026 · Updated last week
- ☆24 · Oct 23, 2025 · Updated 4 months ago
- Research Artifact For Our Submission To VLDB ☆10 · Oct 27, 2021 · Updated 4 years ago
- StAtutory Reasoning Assessment ☆16 · Dec 8, 2022 · Updated 3 years ago
- Python code examples for accessing and analyzing SEC's XBRL Data Sets ☆43 · Jan 21, 2026 · Updated 2 months ago
- Data and code for EMNLP 2022 paper "ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering" ☆119 · Nov 9, 2022 · Updated 3 years ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆233 · Dec 2, 2024 · Updated last year
- ☆15 · Oct 30, 2021 · Updated 4 years ago
- Code for 'Contrastive Multi-Document Question Generation' ☆11 · Oct 16, 2022 · Updated 3 years ago
- ☆43 · Jul 10, 2024 · Updated last year
- ☆16 · May 14, 2025 · Updated 10 months ago
- Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks ☆12 · Sep 1, 2023 · Updated 2 years ago
- The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, who… ☆20 · Dec 7, 2022 · Updated 3 years ago
- This is a work in progress package that enables users to conduct fundamental financial research, utilising the SEC's EDGAR API. ☆70 · Mar 2, 2026 · Updated 2 weeks ago
- Comprehensive benchmark for RAG ☆272 · Jun 14, 2025 · Updated 9 months ago
- ☆21 · Oct 22, 2021 · Updated 4 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details. ☆14 · Mar 20, 2024 · Updated 2 years ago
- Data and Code for ACL 2024 paper "DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Docu… ☆23 · Dec 21, 2024 · Updated last year
- LUNA: a Framework for Language Understanding and Naturalness Assessment. ☆12 · Sep 9, 2023 · Updated 2 years ago
- Official code repository to the corresponding paper. ☆29 · Sep 14, 2023 · Updated 2 years ago
- Code and dataset for the paper: Generating Literal and Implied Subquestions to Fact-check Complex Claims ☆30 · May 30, 2023 · Updated 2 years ago
- This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning … ☆838 · Mar 4, 2025 · Updated last year
- Python library containing BART query generation and BERT-based Siamese models for neural retrieval. ☆40 · Oct 30, 2020 · Updated 5 years ago
- DEREK (Domain Entities and Relations Extraction Kit) ☆10 · May 22, 2023 · Updated 2 years ago
- BERT score for text generation ☆12 · Jan 15, 2025 · Updated last year
- The FinEval financial domain evaluation benchmark, based on quantitative fundamental methods and developed through long-term objective re… ☆262 · Jun 23, 2025 · Updated 8 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆265 · Updated this week
- Data for paper "Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness" ☆33 · May 3, 2023 · Updated 2 years ago
- WallStr.Chat is an AI research assistant for investment bankers, hedge funds, and PE firms, enabling parallel chat with dozens of PDFs, w… ☆17 · Feb 8, 2026 · Updated last month
- Code Repository for "A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models". ☆15 · Oct 14, 2022 · Updated 3 years ago
- This is the repo of developing reasoning models in the specific domain of financial, aim to enhance models capabilities in handling finan… ☆72 · Jun 23, 2025 · Updated 8 months ago
- ☆18 · Mar 25, 2024 · Updated last year
- Expand -> Retrieve -> Rerank - simple method with strong results on BRIGHT benchmark ☆22 · Aug 22, 2025 · Updated 7 months ago
- Python library to access and analyze SEC Edgar filings, XBRL financial statements, 10-K, 10-Q, and 8-K reports ☆1,862 · Updated this week
- Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization and fairness analysis. ☆29 · Mar 14, 2026 · Updated last week