OpenMOSS/GAOKAO-MM

[ACL'2024 Findings] GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation

☆82

Alternatives and similar repositories for GAOKAO-MM

Users who are interested in GAOKAO-MM are comparing it to the repositories listed below.

OpenLMLab / GAOKAO-Bench-Updates
GAOKAO-Bench-Updates is a supplement to the GAOKAO-Bench, a dataset to evaluate large language models.
☆47 · Jan 7, 2025 · Updated last year
OpenLMLab / GAOKAO-Bench
GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.
☆779 · Jan 7, 2025 · Updated last year
OpenMOSS / Thus-Spake-Long-Context-LLM
A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation
☆62 · Mar 31, 2025 · Updated last year
sii-research / OpenMOSS
OpenMOSS presents a collection of our research on LLMs, supported by SII, Fudan and Mosi.
☆30 · Updated this week
xinghaow99 / prism
[ICML 2026] Prism: Spectral-Aware Block-Sparse Attention
☆27 · May 22, 2026 · Updated 2 months ago
ECNU-ICALK / EduChat-Math
[MM 2025] CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models
☆56 · Oct 20, 2024 · Updated last year
Callione / LLaVA-MOSS2
A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.
☆13 · Sep 19, 2024 · Updated last year
Twilight92z / Quantize-Watermark
☆19 · Nov 6, 2023 · Updated 2 years ago
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆139 · Jun 5, 2024 · Updated 2 years ago
pengshuai-rin / MultiMath
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models
☆33 · Jan 22, 2025 · Updated last year
open-compass / GAOKAO-Eval
☆125 · Oct 7, 2025 · Updated 9 months ago
xinghaow99 / pbs-attn
[ICML 2026] Sparser Block-Sparse Attention via Token Permutation
☆31 · May 22, 2026 · Updated 2 months ago
RUC-NLPIR / RAG-Reading-List
RAG methods, benchmarks, and toolkits
☆19 · Nov 28, 2024 · Updated last year
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆94 · Dec 22, 2024 · Updated last year
ExpressAI / AI-Gaokao
Gaokao Benchmark for AI
☆109 · Jul 8, 2022 · Updated 4 years ago
RUC-NLPIR / ClawTrojan
From Prompt Injection to Persistent Control: Defending Agentic Workspaces Against Trojan Backdoors
☆18 · Jun 1, 2026 · Updated last month
iNLP-Lab / reading-group
☆18 · Jun 17, 2026 · Updated last month
sieve-community / describe
Incredibly descriptive audiovisual summaries for videos
☆40 · Aug 2, 2024 · Updated last year
OpenMOSS / VehicleWorld
VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, …
☆24 · Sep 16, 2025 · Updated 10 months ago
0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37 · Aug 29, 2023 · Updated 2 years ago
InternScience / TrustGeoGen
Official repository for "TrustGeoGen: Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving"
☆23 · Sep 1, 2025 · Updated 10 months ago
hkust-nlp / deepsearch-tts
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
☆21 · Oct 8, 2025 · Updated 9 months ago
0nutation / SpeechGPT2.github.io
☆12 · Jul 23, 2024 · Updated last year
hkust-nlp / RL-Verifier-Robustness
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆24 · Oct 7, 2025 · Updated 9 months ago
OpenMOSS / MOSS-Video-Preview
A real-time video understanding foundation model with gated cross-attention. Offline & real-time inference.
☆162 · Jul 16, 2026 · Updated last week
KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆28 · Dec 23, 2024 · Updated last year
yxzwang / FamilyTool
FamilyTool benchmark
☆14 · Sep 10, 2025 · Updated 10 months ago
alibaba / vstyle
☆34 · Sep 15, 2025 · Updated 10 months ago
OpenMOSS / claude-codex-handoff
Drop-in async file-based handoff protocol for two AI coding agents (Claude Code + Codex), installed as one shared .handoff/ in your proje…
☆30 · Jul 4, 2026 · Updated 2 weeks ago
RUC-NLPIR / Tool-Light
Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning
☆34 · Sep 30, 2025 · Updated 9 months ago
OpenMOSS / Lorsa
☆30 · Nov 9, 2025 · Updated 8 months ago
PromptLabs / hackaprompt
☆21 · Dec 9, 2023 · Updated 2 years ago
RUC-NLPIR / DeepImageSearch
☆86 · May 2, 2026 · Updated 2 months ago
felixludos / alphageometry
☆13 · Oct 10, 2024 · Updated last year
RUC-NLPIR / iAgent
Including 12+ cutting-edge agent systems across multiple research directions
☆35 · Nov 10, 2025 · Updated 8 months ago
xinghaow99 / DenoSent
[AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning
☆15 · Apr 29, 2024 · Updated 2 years ago
plageon / HierSearch
HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches
☆40 · Oct 9, 2025 · Updated 9 months ago
chtsy / buol
☆48 · Oct 29, 2025 · Updated 8 months ago
BotPlayers / BotPlayers
Play with agents and more.
☆22 · Sep 18, 2023 · Updated 2 years ago