math-eval / TAL-SCQ5K
☆147Updated last year
Alternatives and similar repositories for TAL-SCQ5K:
Users who are interested in TAL-SCQ5K are comparing it to the repositories listed below
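- This project covers dataset synthesis, model training, and evaluation for the math problem-solving ability of large models, along with notes on related articles.☆80Updated 6 months ago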
- deep learning☆150Updated 2 weeks ago
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆139Updated 11 months ago
- Gaokao Benchmark for AI☆108Updated 2 years ago
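- SuperCLUE-Agent: a benchmark for evaluating core agent capabilities, based on native Chinese tasks☆83Updated last year
- SuperCLUE-Math6: exploring a new-generation native Chinese dataset for multi-turn, multi-step mathematical reasoning☆54Updated last year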
- CodeGPT: A Code-Related Dialogue Dataset Generated by GPT and for GPT☆113Updated last year
- SOTA Math Opensource LLM☆330Updated last year
- HanFei-1.0 (韩非): the first legal large language model in China trained with full-parameter training☆114Updated last year
- 1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc☆162Updated last year
- ☆142Updated 8 months ago
- MathEval is a benchmark dedicated to the holistic evaluation on mathematical capacities of LLMs.☆74Updated 4 months ago
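- This project aims to give beginners in the large-model field a comprehensive body of knowledge, covering both basic and advanced topics, so that developers can quickly master the large-model technology stack and understand the related material in full.☆51Updated 2 months ago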
- ☆105Updated 4 months ago
- A multi-dimensional Chinese alignment evaluation benchmark for large models (ACL 2024)☆367Updated 7 months ago
- ☆81Updated 11 months ago
- ☆224Updated 4 months ago
- Light local website for displaying performances from different chat models.☆85Updated last year
- A Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark☆100Updated last year
- ☆95Updated last year
- ☆64Updated last year
- ☆128Updated last year
- ☆205Updated last year
- ☆88Updated 11 months ago
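- Aims to provide an intuitive, concrete, and standardized evaluation of current mainstream LLMs☆94Updated last year
- How to train an LLM tokenizer☆142Updated last year
- An open-source multimodal large language model based on baichuan-7b☆73Updated last year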
- FlagEval is an evaluation toolkit for AI large foundation models.☆327Updated 8 months ago
- A tool for manually annotating and ranking response data in the RLHF stage of large-model training.☆249Updated last year
- Just for debug☆56Updated last year