FinEval is a financial-domain evaluation benchmark built on quantitative fundamental methods and developed through long-term research, summarization, and rigorous manual screening. It comprises over 26,000 questions across diverse question types that closely match real-world application scenarios.
☆274 · Jun 23, 2025 · Updated 11 months ago
Alternatives and similar repositories for FinEval
Users who are interested in FinEval are comparing it to the libraries listed below.
- ☆261 · Dec 25, 2023 · Updated 2 years ago
- IDEAFinBench: a financial knowledge evaluation benchmark ☆15 · Apr 8, 2024 · Updated 2 years ago
- DISC-FinLLM, a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial advisory services in financial scenarios. … ☆882 · Nov 1, 2023 · Updated 2 years ago
- Fin-R1 is a large language model for complex financial reasoning developed and open-sourced with the joint efforts of the SUFE-AIFLM-Lab … ☆796 · Mar 27, 2025 · Updated last year
- ☆284 · Jul 10, 2023 · Updated 2 years ago
- When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain ☆57 · Feb 11, 2025 · Updated last year
- XuanYuan (轩辕): Du Xiaoman's Chinese financial dialogue large language model ☆1,322 · Jan 7, 2025 · Updated last year
- Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023] ☆1,854 · Jul 27, 2025 · Updated 10 months ago
- Chinese Generation Evaluation ☆13 · Aug 14, 2023 · Updated 2 years ago
- This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning … ☆866 · Mar 4, 2025 · Updated last year
- FinGLM: a project dedicated to building an open, public-interest, and long-lasting financial large model, using open source and openness to advance "AI + finance". ☆2,237 · May 8, 2024 · Updated 2 years ago
- Shaping Language Models with Cognitive Insights ☆15 · Feb 29, 2024 · Updated 2 years ago
- Repository for "Zero is Not Hero Yet: Benchmarking Zero-Shot Performance of LLMs for Financial Tasks" ☆26 · Jul 31, 2023 · Updated 2 years ago
- CFinBench: A Comprehensive Chinese Financial Benchmark for Large Language Models ☆16 · Oct 14, 2024 · Updated last year
- 智鹿 (Zhilu): a Chinese dialogue large language model for the consumer finance domain ☆30 · Nov 12, 2023 · Updated 2 years ago
- A Large-Scale Dataset for Long Text and Multi-Table Summarization ☆18 · Feb 21, 2024 · Updated 2 years ago
- ☆17 · Jul 10, 2023 · Updated 2 years ago
- Research on evaluating and aligning the values of Chinese large language models ☆556 · Jul 20, 2023 · Updated 2 years ago
- An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability. ☆299 · May 31, 2023 · Updated 3 years ago
- Tongyi Dianjin (通义点金): resources for a Chinese financial-industry large language model ☆91 · Jan 7, 2025 · Updated last year
- Data and code for ACL 2022 paper "MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data" ☆54 · Oct 22, 2024 · Updated last year
- Chinese Financial Assistant Benchmark for Large Language Model ☆55 · Jul 30, 2025 · Updated 10 months ago
- A large-scale 7B pretraining language model developed by BaiChuan-Inc. ☆5,654 · Jul 18, 2024 · Updated last year
- ☆77 · Dec 14, 2024 · Updated last year
- 聚宝盆 (Cornucopia): a series of open-source, commercially usable Chinese financial large models, with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.) ☆657 · Jun 30, 2023 · Updated 2 years ago
- AgentTuning: Enabling Generalized Agent Abilities for LLMs ☆1,497 · Oct 31, 2023 · Updated 2 years ago
- BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large language model) ☆8,275 · Oct 16, 2024 · Updated last year
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ☆7,075 · Jun 5, 2026 · Updated last week
- ☆16 · Jan 23, 2026 · Updated 4 months ago
- LAiW: A Chinese Legal Large Language Models Benchmark ☆92 · Jul 3, 2024 · Updated last year
- SuperCLUE-Agent: a benchmark for evaluating core agent capabilities based on native Chinese tasks ☆94 · Nov 9, 2023 · Updated 2 years ago
- ☆25 · Jun 19, 2024 · Updated last year
- Dataset and code for SEntFiN ☆10 · May 31, 2023 · Updated 3 years ago
- A 13B large language model developed by Baichuan Intelligent Technology ☆2,930 · Sep 6, 2023 · Updated 2 years ago
- Official code and data of the ACL 2022 paper "Program Transfer for Complex Question Answering over Knowledge Bases" ☆14 · May 4, 2022 · Updated 4 years ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) ☆3,492 · Feb 8, 2026 · Updated 4 months ago
- ☆326 · Dec 3, 2024 · Updated last year
- Dataset published in paper "FinRED: A Dataset for Relation Extraction in Financial Domain" ☆30 · Apr 15, 2022 · Updated 4 years ago
- ☆98 · Dec 5, 2023 · Updated 2 years ago