SUFE-AIFLM-Lab/FinEval

SUFE-AIFLM-Lab / FinEval

FinEval is a financial-domain evaluation benchmark. It is based on quantitative fundamental methods, was developed through long-term research, summarization, and rigorous manual screening, and uses over 26,000 diverse question types that closely match real-world application scenarios.

☆283

Alternatives and similar repositories for FinEval

Users who are interested in FinEval are comparing it to the repositories listed below.

alipay / financial_evaluation_dataset
☆263 · Dec 25, 2023 · Updated 2 years ago
SUFE-AIFLM-Lab / FinGAIA
☆24 · Oct 29, 2025 · Updated 8 months ago
DataArcTech / IDEAFinBench
IDEAFinBench: a financial knowledge evaluation benchmark
☆16 · Apr 8, 2024 · Updated 2 years ago
FudanDISC / DISC-FinLLM
DISC-FinLLM is a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial consulting services in financial scenarios.
☆890 · Nov 1, 2023 · Updated 2 years ago
SUFE-AIFLM-Lab / Fin-R1
Fin-R1 is a large language model for complex financial reasoning developed and open-sourced with the joint efforts of the SUFE-AIFLM-Lab …
☆809 · Mar 27, 2025 · Updated last year
supersymmetry-technologies / BBT-FinCUGE-Applications
☆286 · Jul 10, 2023 · Updated 3 years ago
SALT-NLP / FLANG
When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain
☆57 · Feb 11, 2025 · Updated last year
Duxiaoman-DI / XuanYuan
XuanYuan: Du Xiaoman's Chinese financial dialogue large language model
☆1,323 · Jan 7, 2025 · Updated last year
Felixgithub2017 / CG-Eval
Chinese Generation Evaluation
☆13 · Aug 14, 2023 · Updated 2 years ago
hkust-nlp / ceval
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
☆1,862 · Jul 27, 2025 · Updated 11 months ago
MetaGLM / FinGLM
FinGLM: dedicated to building an open, non-profit, and long-lasting financial large-model project that uses open source to advance "AI + finance".
☆2,256 · May 8, 2024 · Updated 2 years ago
dwzq-com-cn / DongwuLLM
This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.
☆12 · Mar 11, 2024 · Updated 2 years ago
The-FinAI / FinBen
☆15 · May 19, 2025 · Updated last year
gtfintechlab / zero-shot-finance
Repository for "Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks"
☆26 · Jul 31, 2023 · Updated 2 years ago
aliyun / cflue
Tongyi Dianjin: resources for a Chinese financial-industry large language model
☆91 · Jan 7, 2025 · Updated last year
SYSU-MUCFC-FinTech-Research-Center / ZhiLu
ZhiLu: a Chinese dialogue large language model for the consumer finance domain
☆30 · Nov 12, 2023 · Updated 2 years ago
whunextgen / LLMindCraft
Shaping Language Models with Cognitive Insights
☆15 · Feb 29, 2024 · Updated 2 years ago
swtheing / LLM-Performance-Improvement-Paper
☆17 · Jul 10, 2023 · Updated 3 years ago
X-PLUG / CValues
Research on evaluating and aligning the values of Chinese large language models
☆560 · Jul 20, 2023 · Updated 3 years ago
pyRis / SEntFiN
Dataset and codes for SEntFiN
☆10 · May 31, 2023 · Updated 3 years ago
OFA-Sys / ExpertLLaMA
An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.
☆298 · May 31, 2023 · Updated 3 years ago
TongjiFinLab / CFBenchmark
Chinese Financial Assistant Benchmark for Large Language Model
☆55 · Jul 30, 2025 · Updated 11 months ago
psunlpgroup / MultiHiertt
Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data"
☆54 · Oct 22, 2024 · Updated last year
valuesimplex / FinLongEval
☆77 · Dec 14, 2024 · Updated last year
baichuan-inc / Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
☆5,650 · Jul 18, 2024 · Updated 2 years ago
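jerry1993-tech / Cornucopia-LLaMA-Fin-Chinese
Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial large language models, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)
☆659 · Jun 30, 2023 · Updated 3 years ago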
THUDM / AgentTuning
AgentTuning: Enabling Generalized Agent Abilities for LLMs
☆1,500 · Oct 31, 2023 · Updated 2 years ago
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large language model)
☆8,277 · Oct 16, 2024 · Updated last year
open-compass / opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, …
☆7,235 · Updated this week
BWY-02 / CFinBench-Eval
CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models
☆16 · Oct 14, 2024 · Updated last year
Dai-shen / LAiW
LAiW: A Chinese Legal Large Language Models Benchmark
☆90 · Jul 3, 2024 · Updated 2 years ago
CLUEbenchmark / SuperCLUE-Agent
SuperCLUE-Agent: a benchmark for evaluating the core capabilities of AI agents, based on native Chinese tasks
☆95 · Nov 9, 2023 · Updated 2 years ago
mlfoundations / VisIT-Bench
☆51 · Oct 29, 2023 · Updated 2 years ago
baichuan-inc / Baichuan-13B
A 13B large language model developed by Baichuan Intelligent Technology
☆2,930 · Sep 6, 2023 · Updated 2 years ago
rajdeep345 / ECTSum
Dataset and Codes for our EMNLP 2022 Main Conference Long Paper titled "ECTSum: A New Benchmark Dataset For Bullet Point Summarization of…
☆34 · May 22, 2024 · Updated 2 years ago
THUDM / AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
☆3,601 · Feb 8, 2026 · Updated 5 months ago
soummyaah / FinRED
Dataset published in paper "FinRED: A Dataset for Relation Extraction in Financial Domain"
☆29 · Apr 15, 2022 · Updated 4 years ago
LC1332 / Learn-Python-with-GPT
Teacher Li Lulu's Copilot-Python course. Co-evolving with ChatGPT and other large language models.
☆10 · Jun 3, 2025 · Updated last year
facebookresearch / adaptive_scheduling
Experimental scripts for researching data adaptive learning rate scheduling.
☆23 · Oct 18, 2023 · Updated 2 years ago