OpenLMLab/GAOKAO-Bench-Updates

OpenLMLab / GAOKAO-Bench-Updates

GAOKAO-Bench-Updates is a supplement to GAOKAO-Bench, a dataset for evaluating large language models.

☆47

Alternatives and similar repositories for GAOKAO-Bench-Updates

Users who are interested in GAOKAO-Bench-Updates are comparing it to the repositories listed below.

OpenMOSS / GAOKAO-MM
[ACL'2024 Findings] GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation
☆82 • Mar 13, 2024 • Updated 2 years ago
OpenLMLab / GAOKAO-Bench
GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.
☆779 • Jan 7, 2025 • Updated last year
llmeval / Llmeval-Gaokao2024-Math
LLM evaluation on 2024 Chinese Gaokao Mathematics — zero-contamination benchmark with dual prompt formats
☆21 • Apr 15, 2026 • Updated 3 months ago
logikon-ai / cot-eval
A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.
☆19 • Feb 6, 2025 • Updated last year
XL2248 / CPCC
Code and Data for the ACL21 paper "Modeling Bilingual Conversational Characteristics for Neural Chat Translation"
☆12 • Dec 17, 2021 • Updated 4 years ago
Twilight92z / Quantize-Watermark
☆19 • Nov 6, 2023 • Updated 2 years ago
QinJinghui / SAU-Solver
☆14 • Jul 22, 2021 • Updated 5 years ago
nex-agi / NexHTML
HTML Agent based on NexAU
☆16 • Nov 20, 2025 • Updated 8 months ago
facebookresearch / dmae_st
Directed masked autoencoders
☆14 • Mar 25, 2026 • Updated 3 months ago
OpenLMLab / ParallelTokenizer
Use the tokenizer in parallel to achieve superior acceleration
☆20 • Mar 21, 2024 • Updated 2 years ago
nex-agi / weaver
Python SDK for Weaver.
☆17 • Updated this week
qijimrc / mm_evaluation
☆11 • Aug 4, 2024 • Updated last year
flageval-baai / HalluDial
☆21 • Aug 19, 2024 • Updated last year
BAAI-WuDao / LegalPLMs
Source code and checkpoints for legal pre-trained language models.
☆14 • May 9, 2021 • Updated 5 years ago
GAIR-NLP / OlympicArena
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
☆106 • Mar 6, 2025 • Updated last year
lauhaide / clads
XWikisCorpus, cross-lingual summarisation, multi-lingual summarisation, pre-trained language models, zero-shot and few-shot summarisation…
☆10 • Nov 4, 2022 • Updated 3 years ago
NoSyu / VHUCM
Implementation of Variational Hierarchical User-based Conversation Model
☆10 • Jul 2, 2021 • Updated 5 years ago
tsuruoka-lab / AMI-Meeting-Parallel-Corpus
AMI Meeting Parallel Corpus
☆13 • Dec 11, 2020 • Updated 5 years ago
open-compass / GAOKAO-Eval
☆125 • Oct 7, 2025 • Updated 9 months ago
microsoft / AVGen-Bench
[ICML26] AVGen-Bench is a task-driven benchmark for multi-granular evaluation of Text-to-Audio-Video (T2AV) generation.
☆22 • Jul 2, 2026 • Updated 3 weeks ago
OpenMOSS / LongLLaDA
[AAAI26] LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs
☆55 • Dec 7, 2025 • Updated 7 months ago
HM-RunningHub / ComfyUI_RH_MOVA
This is a ComfyUI plugin for https://github.com/OpenMOSS/MOVA
☆22 • Jan 30, 2026 • Updated 5 months ago
MARIO-Math-Reasoning / MARIO_EVAL
☆52 • Mar 5, 2025 • Updated last year
korokes / MCLS
Assist Non-native Viewers: Multimodal Crosslingual Summarization for How2 Videos
☆10 • Sep 2, 2024 • Updated last year
ruili33 / SEC
Source code for the paper "Are Human-generated Demonstrations Necessary for In-context Learning?"
☆12 • Jan 21, 2024 • Updated 2 years ago
OpenMOSS / claude-codex-handoff
Drop-in async file-based handoff protocol for two AI coding agents (Claude Code + Codex), installed as one shared .handoff/ in your proje…
☆30 • Jul 4, 2026 • Updated 2 weeks ago
JHart96 / keras_gcn_sequence_labelling
Keras implementation of graph convolutional networks for sequence labelling
☆12 • Sep 21, 2018 • Updated 7 years ago
open-compass / MathBench
[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset
☆115 • May 22, 2025 • Updated last year
THUDM / BatchSampler
The source code for BatchSampler, accepted at KDD'23
☆19 • Aug 9, 2023 • Updated 2 years ago
ibraheem-moosa / mt-ranker
Code for the ICLR'24 paper: MT-RANKER : Reference-free machine translation evaluation by inter-system ranking
☆10 • Feb 29, 2024 • Updated 2 years ago
OpenMOSS / ABC-Bench
ABC-Bench is a benchmark for Agentic Backend Coding. It evaluates whether code agents can explore real repositories, edit code, configure…
☆33 • Jan 20, 2026 • Updated 6 months ago
CLUEbenchmark / Math24o
Math24o: High School Olympiad Mathematics Chinese Benchmark, an evaluation set of high school mathematics olympiad competition problems
☆14 • Mar 27, 2025 • Updated last year
nex-agi / NexA4A
Nex Agent for Agent is a meta-agent system that automatically creates specialized AI agents based on natural language requirements.
☆29 • Nov 18, 2025 • Updated 8 months ago
xhh678876 / openclaw-sjtu
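🎓 All-in-one AI assistant for Shanghai Jiao Tong University: an OpenClaw-based SJTU campus Skill pack | 19 features covering deadlines (DDL), course selection, email, canteens, library, and PPT generation
☆66 • May 30, 2026 • Updated last month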
titu1994 / simple_diffusion
Simple notebooks to learn diffusion models on toy datasets
☆17 • Feb 9, 2023 • Updated 3 years ago
YLXDXX / AM601-kaoyan
University of Chinese Academy of Sciences, 601 Advanced Mathematics (A): a collection of past postgraduate entrance exam papers
☆13 • Aug 4, 2025 • Updated 11 months ago
kevinyaobytedance / llm_eval
LLM evaluation.
☆16 • Nov 7, 2023 • Updated 2 years ago
IsakZhang / XABSA
☆10 • Nov 29, 2021 • Updated 4 years ago
fandongmeng / DTMT_InDec
Implementation of DTMT with incremental decoding
☆13 • Feb 20, 2021 • Updated 5 years ago