☆316Dec 3, 2024Updated last year
Alternatives and similar repositories for financebench
Users that are interested in financebench are comparing it to the libraries listed below.
- KITE (Knowledge-Intensive Task Evaluation) is an end-to-end benchmark for RAG pipelines☆23Aug 14, 2024Updated last year
- Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"☆374Jun 6, 2022Updated 3 years ago
- How to build a knowledge graph☆13Nov 16, 2021Updated 4 years ago
- An OpenBB agent slack bot that is ready to answer any financial question☆12Feb 24, 2024Updated 2 years ago
- KEDA External Scaler for Azure Cosmos DB☆11Updated this week
- ☆26Oct 23, 2025Updated 6 months ago
- Data and code for EMNLP 2022 paper "ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering"☆123Nov 9, 2022Updated 3 years ago
- ☆15Mar 26, 2025Updated last year
- ☆23Mar 6, 2024Updated 2 years ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"☆247Dec 2, 2024Updated last year
- Measuring RAG solutions throughput and latency☆20Jul 23, 2024Updated last year
- Code for 'Contrastive Multi-Document Question Generation'☆11Oct 16, 2022Updated 3 years ago
- ☆45Jul 10, 2024Updated last year
- Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks☆12Sep 1, 2023Updated 2 years ago
- The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, who…☆20Dec 7, 2022Updated 3 years ago
- ☆21Oct 22, 2021Updated 4 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- Comprehensive benchmark for RAG☆286Jun 14, 2025Updated 11 months ago
- Data and Code for ACL 2024 paper "DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Docu…☆23Dec 21, 2024Updated last year
- LUNA: a Framework for Language Understanding and Naturalness Assessment.☆12Sep 9, 2023Updated 2 years ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.☆2,181Oct 16, 2025Updated 7 months ago
- ☆14Oct 17, 2024Updated last year
- Python library containing BART query generation and BERT-based Siamese models for neural retrieval.☆40Oct 30, 2020Updated 5 years ago
- This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning …☆862Mar 4, 2025Updated last year
- DEREK (Domain Entities and Relations Extraction Kit)☆10May 22, 2023Updated 2 years ago
- Outline to Story: Fine-grained Controllable Story Generation from Cascaded Events☆18Jun 16, 2022Updated 3 years ago
- BERT score for text generation☆12Jan 15, 2025Updated last year
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.☆272Mar 25, 2026Updated last month
- Apify's reusable github workflows☆15May 14, 2026Updated last week
- code associated with WANLI dataset in Liu et al., 2022☆30May 24, 2023Updated 2 years ago
- (ICML 2025) Rethinking Chain-of-Thought from the Perspective of Self-Training☆13Feb 15, 2025Updated last year
- First-place (top-1) solution for the Tianchi algorithm competition "BetterMixture - Large Model Data Mixing Challenge"☆33Jul 7, 2024Updated last year
- This repo develops reasoning models for the financial domain, aiming to enhance models' capabilities in handling finan…☆75Jun 23, 2025Updated 10 months ago
- Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"☆14Jul 22, 2025Updated 10 months ago
- Scaling Agentic Environments Automatically.☆63Mar 26, 2026Updated last month
- ☆59Jun 7, 2024Updated last year
- Vector analytics and search library using dispersion models. Provides graph analysis, vector search and an energy-distribution stats for …☆35May 12, 2026Updated last week
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 3 years ago
- Structured pruning and bias visualization for Large Language Models. Tools for LLM optimization and fairness analysis.☆39Updated this week