Felixgithub2017/MMCU

MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING

☆90

Alternatives and similar repositories for MMCU

Users who are interested in MMCU are comparing it to the repositories listed below.

MikeGu721 / XiezhiBenchmark
☆98 · Dec 5, 2023 · Updated 2 years ago
tjunlp-lab / M3KE
A Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark
☆106 · Jul 20, 2023 · Updated 3 years ago
WeOpenML / PandaLM
☆927 · May 22, 2024 · Updated 2 years ago
chenzen94 / debug-deepspeed-chat
Debug DeepSpeed-Chat step by step in an IDE
☆10 · Apr 17, 2023 · Updated 3 years ago
ruixiangcui / AGIEval
☆774 · Jun 13, 2024 · Updated 2 years ago
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue large language model)
☆8,273 · Oct 16, 2024 · Updated last year
hkust-nlp / ceval
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
☆1,860 · Jul 27, 2025 · Updated 11 months ago
CLUEbenchmark / ZeroCLUE
Zero-shot learning evaluation benchmark, Chinese version
☆59 · Jun 23, 2021 · Updated 5 years ago
haonan-li / CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
☆828 · Dec 6, 2024 · Updated last year
GTCOM-NLP / GeWu-BigModel
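GeWu: multilingual and Chinese large-scale pretrained models, lightweight edition, covering Chinese-only, knowledge-enhanced, and 113-language multilingual variants; built on the mainstream RoBERTa architecture, suited to NLU and NLG tasks, and supporting frameworks such as PyTorch, TensorFlow, UER, and Hugging Face. Multilingual and Chinese …
☆30 · Nov 17, 2022 · Updated 3 years ago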
sufengniu / RefGPT
☆164 · Apr 17, 2023 · Updated 3 years ago
THUIR / T2Ranking
T2Ranking: A large-scale Chinese benchmark for passage ranking.
☆161 · Jul 3, 2023 · Updated 3 years ago
JiangXiaElves / ZhenHuanBot
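A bot built by training the Bloomz model on script data from 甄嬛传 (Legend of Zhen Huan), giving different answers to different characters in Zhen Huan's voice
☆13 · Jun 10, 2023 · Updated 3 years ago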
GanjinZero / RRHF
[NIPS2023] RRHF & Wombat
☆806 · Sep 22, 2023 · Updated 2 years ago
Felixgithub2017 / CG-Eval
Chinese Generation Evaluation
☆13 · Aug 14, 2023 · Updated 2 years ago
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆139 · Jun 5, 2024 · Updated 2 years ago
OpenLMLab / ChatZoo
Lightweight local website for displaying the performance of different chat models.
☆86 · Nov 13, 2023 · Updated 2 years ago
vyraun / long-tailed
Code for "On Long-Tailed Phenomena in NMT".
☆10 · Jan 10, 2021 · Updated 5 years ago
CLUEbenchmark / pCLUE
pCLUE: 1,000,000+ multi-task prompt learning dataset
☆509 · Oct 4, 2022 · Updated 3 years ago
xionghonglin / DoctorGLM
A Chinese medical consultation model based on ChatGLM-6B
☆834 · Oct 19, 2023 · Updated 2 years ago
princeton-nlp / WhatICLLearns
[ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning
☆21 · Jul 9, 2023 · Updated 3 years ago
LaVi-Lab / CLEVA
[EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"
☆64 · May 16, 2025 · Updated last year
OpenLMLab / GAOKAO-Bench
GAOKAO-Bench is an evaluation framework that uses GAOKAO questions as a dataset to evaluate large language models.
☆778 · Jan 7, 2025 · Updated last year
spraakbanken / multiged-2023
☆15 · Apr 12, 2023 · Updated 3 years ago
CVI-SZU / Linly
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets
☆3,045 · Apr 14, 2024 · Updated 2 years ago
RUCKBReasoning / DSM
☆17 · Jan 5, 2023 · Updated 3 years ago
songmzhang / CBMI
Code for the ACL 2022 paper "Conditional Bilingual Mutual Information based Adaptive Training for Neural Machine Translation".
☆14 · Aug 6, 2022 · Updated 3 years ago
XueFuzhao / InstructionWild
☆462 · Jun 9, 2024 · Updated 2 years ago
FreedomIntelligence / LLMZoo
LLM Zoo is a project that provides data, models, and an evaluation benchmark for large language models.
☆2,941 · Nov 26, 2023 · Updated 2 years ago
Blue-Raincoat / SelectIT
☆24 · Oct 14, 2024 · Updated last year
mcao516 / Factual-Error-Correction
☆23 · May 26, 2022 · Updated 4 years ago
ZhaofengWu / counterfactual-evaluation
☆58 · May 19, 2025 · Updated last year
CLUEbenchmark / FewCLUE
FewCLUE: few-shot learning evaluation benchmark, Chinese version
☆517 · Sep 21, 2022 · Updated 3 years ago
THUIR / THUIR-website
THUIR website
☆10 · Feb 23, 2026 · Updated 4 months ago
ydli-ai / CSL
[COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset
☆673 · Jun 19, 2023 · Updated 3 years ago
CLUEbenchmark / SuperCLUE
SuperCLUE: A comprehensive benchmark for general-purpose Chinese large models | A Benchmark for Foundation Models in Chinese
☆3,296 · Feb 6, 2026 · Updated 5 months ago
KodCode-AI / code-r1
Reproducing R1 for Code with Reliable Rewards
☆13 · Apr 9, 2025 · Updated last year
THU-KEG / KoLA
[ICLR24] The open-source repo of THU-KEG's KoLA benchmark.
☆57 · Sep 28, 2023 · Updated 2 years ago
princeton-nlp / Cognac
Repo for paper: Controllable Text Generation with Language Constraints
☆20 · Jun 20, 2023 · Updated 3 years ago