KbsdJames/Omni-MATH

The official repository of the Omni-MATH benchmark.

☆94

Alternatives and similar repositories for Omni-MATH

Users who are interested in Omni-MATH are comparing it to the repositories listed below.

KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆28 · Dec 23, 2024 · Updated last year
KbsdJames / MATH-Minos
The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee…
☆38 · Jul 25, 2024 · Updated 2 years ago
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25 · Sep 26, 2024 · Updated last year
Yifan-Song793 / InfoCL
Findings of EMNLP 2023: InfoCL: Alleviating Catastrophic Forgetting in Continual Text Classification from An Information Theoretic Perspe…
☆14 · Aug 13, 2024 · Updated last year
Yifan-Song793 / GoodBadGreedy
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
☆31 · Jul 17, 2024 · Updated 2 years ago
WeiminXiong / RationaleCL
Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference)
☆12 · Oct 11, 2023 · Updated 2 years ago
F2-Song / ICDPO
The official implementation of "ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization…
☆16 · Feb 15, 2024 · Updated 2 years ago
RenShuhuai-Andy / my-tools
my commonly-used tools
☆64 · Jan 7, 2025 · Updated last year
koalazf99 / nanoverl
Collections of RLxLM experiments using minimal codes
☆14 · Feb 17, 2025 · Updated last year
pkunlp-icler / SCL-RAI
Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022
☆11 · Aug 20, 2022 · Updated 3 years ago
open-compass / MathBench
[ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset
☆116 · May 22, 2025 · Updated last year
KbsdJames / Awesome-LLM-Preference-Learning
The official repository of our survey paper: "Towards a Unified View of Preference Learning for Large Language Models: A Survey"
☆192 · Oct 28, 2024 · Updated last year
GAIR-NLP / ReasonEval
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
☆80 · Oct 9, 2025 · Updated 9 months ago
F2-Song / Weak-to-Strong-Decoding
The official implementation of "Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding"
☆22 · Jun 26, 2025 · Updated last year
lancopku / MUKI
[Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models
☆19 · Mar 16, 2023 · Updated 3 years ago
TobiasLee / VEC
Visual and Embodied Concepts evaluation benchmark
☆21 · Oct 10, 2023 · Updated 2 years ago
pkunlp-icler / PCA-EVAL
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
☆107 · Mar 14, 2024 · Updated 2 years ago
chenllliang / ATP-AMR
Source code for paper "ATP: AMRize Than Parse! Enhancing AMR Parsing with PseudoAMRs" @NAACL-2022
☆15 · Mar 31, 2023 · Updated 3 years ago
tongyx361 / Awesome-LLM4Math
Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit…
☆159 · Jul 12, 2024 · Updated 2 years ago
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 · Aug 16, 2023 · Updated 2 years ago
lancopku / DCKD
Code and data for Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction (ECML-PKDD 22)
☆16 · Sep 6, 2022 · Updated 3 years ago
GAIR-NLP / OlympicArena
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
☆106 · Mar 6, 2025 · Updated last year
chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17 · Sep 27, 2023 · Updated 2 years ago
protagolabs / odyssey-math
☆84 · Jan 25, 2025 · Updated last year
kkk-an / UltraIF
Code of EMNLP 2025 paper 'UltraIF: Advancing Instruction Following from the Wild'.
☆21 · Apr 3, 2025 · Updated last year
He-Ren / OJBench
☆33 · Feb 28, 2026 · Updated 5 months ago
ChengpengLi1003 / DotaMath
☆30 · Dec 27, 2024 · Updated last year
Wangpeiyi9979 / ESD
Code for NAACL2022 Long Paper "An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling"
☆27 · Nov 9, 2022 · Updated 3 years ago
chenllliang / MLS
Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ACL-2022
☆18 · May 19, 2022 · Updated 4 years ago
rookie-joe / AutoPSV
☆50 · Oct 28, 2024 · Updated last year
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆68 · Oct 18, 2024 · Updated last year
OpenBMB / OlympiadBench
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie…
☆195 · Jun 8, 2025 · Updated last year
pkunlp-icler / IKE
☆25 · Feb 27, 2023 · Updated 3 years ago
sarahmart / HARDMath
A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low a…
☆30 · Feb 14, 2025 · Updated last year
zhaoyu-li / DL4TP
[COLM 2024] A Survey on Deep Learning for Theorem Proving
☆228 · May 28, 2025 · Updated last year
feifeibear / DPSKV3MFU
Estimate MFU for DeepSeekV3
☆26 · Jan 5, 2025 · Updated last year
WeiminXiong / MPO
MPO: Boosting LLM Agents with Meta Plan Optimization (EMNLP 2025 Findings)
☆81 · Aug 20, 2025 · Updated 11 months ago
LLM360 / MegaMath
[COLM 2025] An Open Math Pre-training Dataset with 370B Tokens.
☆110 · Apr 4, 2025 · Updated last year
PKU-TANGENT / ConFiguRe
Dataset and baseline for Coling 2022 long paper (oral): "ConFiguRe: Exploring Discourse-level Chinese Figures of Speech"
☆12 · Jul 27, 2023 · Updated 3 years ago