CLUEbenchmark/SuperCLUE-Math6

SuperCLUE-Math6: Exploring a new generation of native-Chinese, multi-turn, multi-step mathematical reasoning datasets

☆60

Alternatives and similar repositories for SuperCLUE-Math6

Users who are interested in SuperCLUE-Math6 are comparing it to the repositories listed below.

pkunlp-icler / SCL-RAI
Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022
☆11 · Aug 20, 2022 · Updated 3 years ago
allenai / unifew
Unifew: Unified Fewshot Learning Model
☆18 · Sep 10, 2021 · Updated 4 years ago
renll / SparseLT
[EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing
☆14 · Feb 10, 2023 · Updated 3 years ago
Chenny0808 / ape210k
This is the repository of the Ape210K dataset and baseline models.
☆202 · Dec 10, 2019 · Updated 6 years ago
llan-ml / MetaTNE
Source code for NeurIPS 2020 paper "Node Classification on Graphs with Few-Shot Novel Labels via Meta Transformed Network Embedding"
☆10 · Nov 17, 2020 · Updated 5 years ago
qijimrc / mm_evaluation
☆11 · Aug 4, 2024 · Updated last year
OpenLMLab / GAOKAO-Bench
GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.
☆779 · Jan 7, 2025 · Updated last year
llmeval / Llmeval-Gaokao2024-Math
LLM evaluation on 2024 Chinese Gaokao Mathematics — zero-contamination benchmark with dual prompt formats
☆21 · Apr 15, 2026 · Updated 3 months ago
layumi / To-Academic-Newcomers
☆10 · Jan 20, 2021 · Updated 5 years ago
NTAIX / Chinese-Python-QA-Dataset
An Annotated Question Answering Dataset for Assisting Chinese Python Programming Learners
☆10 · Feb 23, 2024 · Updated 2 years ago
YJiangcm / FollowBench
[ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
☆118 · Jun 12, 2025 · Updated last year
mathllm / MATH-V
[NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.
☆139 · May 16, 2025 · Updated last year
weijia-xu / fairseq-editor
EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints
☆29 · Dec 21, 2021 · Updated 4 years ago
CLUEbenchmark / SuperCLUE-RAG
A native-Chinese evaluation benchmark for retrieval-augmented generation (RAG)
☆131 · Apr 18, 2024 · Updated 2 years ago
AI-EDU-LAB / E-EVAL
Official GitHub repo for E-Eval, a Chinese K12 education evaluation benchmark for LLMs.
☆32 · Feb 19, 2024 · Updated 2 years ago
QinJinghui / NS-Solver
☆19 · Jul 15, 2022 · Updated 4 years ago
KelleyYin / XLM-Plus
☆10 · Oct 15, 2020 · Updated 5 years ago
THUDM / paper-source-trace
☆19 · Sep 29, 2024 · Updated last year
BinWangGzhu / SVLL-ReID
☆14 · Aug 15, 2025 · Updated 11 months ago
WailordHe / cv-arxiv-daily-wailord
🎓 Automatically updates CV papers daily using GitHub Actions (updates every 12 hours)
☆12 · May 17, 2026 · Updated 2 months ago
aadityasingh / HARP
☆22 · Jan 31, 2025 · Updated last year
project-numina / aimo-progress-prize
☆495 · Jul 22, 2024 · Updated last year
KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆28 · Dec 23, 2024 · Updated last year
GaiYu0 / QDGAT
Question-Directed Graph Attention Network for Numerical Reasoning over Text
☆10 · Aug 14, 2020 · Updated 5 years ago
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆55 · Jun 24, 2024 · Updated 2 years ago
MARIO-Math-Reasoning / Super_MARIO
☆341 · Jun 5, 2025 · Updated last year
Sleepychord / cogdata
A light-weight data management system for large-scale pretraining
☆21 · May 17, 2025 · Updated last year
THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆102 · Feb 20, 2025 · Updated last year
THU-KEG / Crab
[CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models
☆18 · May 23, 2025 · Updated last year
carrierlxk / DSLT
Deep Regression Tracking with Shrinkage Loss (ECCV 2018).
☆12 · Sep 3, 2020 · Updated 5 years ago
LindgeW / BiaffineNER
A structured parsing technique for NER
☆15 · May 26, 2023 · Updated 3 years ago
ExpressAI / AI-Gaokao
Gaokao Benchmark for AI
☆109 · Jul 8, 2022 · Updated 4 years ago
Jack-ZC8 / M3AV-dataset
[ACL 2024] A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset
☆24 · May 29, 2025 · Updated last year
ConiferLM / Conifer
Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models
☆91 · Apr 4, 2024 · Updated 2 years ago
NanshineLoong / Self-Evolving-Benchmark
A framework for evolving and testing question-answering datasets with various models.
☆26 · Feb 28, 2024 · Updated 2 years ago
roeeaharoni / sprp-acl2018
Source code and data for "Split and Rephrase: Better Evaluation and a Stronger Baseline"
☆15 · Feb 15, 2019 · Updated 7 years ago
CLUEbenchmark / SuperCLUE-Code3
A native-Chinese, graded benchmark for testing code ability
☆15 · Apr 11, 2024 · Updated 2 years ago
sufengniu / RefGPT
☆164 · Apr 17, 2023 · Updated 3 years ago