A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low accuracy in solving these problems.
☆28Feb 14, 2025Updated last year
Alternatives and similar repositories for HARDMath
Users who are interested in HARDMath are comparing it to the repositories listed below.
- The rule-based evaluation subset and code implementation of Omni-MATH☆27Dec 23, 2024Updated last year
- The official repository of the Omni-MATH benchmark.☆93Dec 22, 2024Updated last year
- ☆30Dec 27, 2024Updated last year
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 8 months ago
- ☆85Jan 25, 2025Updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆73Feb 25, 2025Updated last year
- Code accompanying our NeurIPS 2020 traffic4cast challenge☆14Oct 4, 2021Updated 4 years ago
- ☆23Jan 31, 2025Updated last year
- Kaggle AIMO2 solution with token-efficient reasoning LLM recipes☆46Aug 7, 2025Updated 8 months ago
- Repository for the paper "Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning"☆18Feb 21, 2025Updated last year
- ☆14Oct 21, 2024Updated last year
- ☆11Jun 5, 2024Updated last year
- ☆57Jun 23, 2025Updated 9 months ago
- ☆11Jul 15, 2020Updated 5 years ago
- Modern development with Python in 2024☆12Mar 30, 2026Updated last week
- [ICLR2026] The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"☆27Oct 14, 2025Updated 5 months ago
- The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee…☆38Jul 25, 2024Updated last year
- [ACL 2024]Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie…☆188Jun 8, 2025Updated 10 months ago
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆13May 5, 2025Updated 11 months ago
- ☆19Jun 4, 2024Updated last year
- LaTeX Beamer template crafted for University of Illinois Chicago☆11Dec 7, 2024Updated last year
- ☆14Apr 16, 2025Updated 11 months ago
- Convert MathML to Latex for OneNote to Markdown☆12Mar 17, 2026Updated 3 weeks ago
- !!!!(DEMO)!!!! !!! CHECK OUT THE NEW VERSİON !!! Counting Close People with Yolov7☆13Sep 14, 2022Updated 3 years ago
- A curated list of PhD, RA, and Intern openings in Computer Science (CS), Electrical & Computer Engineering (ECE), and Artificial Intellig…☆21Sep 1, 2025Updated 7 months ago
- Multimodal Compact Bilinear Pooling class in Python☆11Sep 17, 2019Updated 6 years ago
- ☆52Oct 5, 2020Updated 5 years ago
- A C project template with support for CMake and Unity test framework☆11Jun 12, 2018Updated 7 years ago
- DGCIT: Double Generative Adversarial Networks for Conditional Independence Testing☆11Nov 22, 2023Updated 2 years ago
- As defined in Lubotzky, Philips and Sarnak☆10Oct 25, 2022Updated 3 years ago
- ☆79Nov 19, 2024Updated last year
- [IJCAI 2023] The official repo of paper 'Automatic Truss Design with Reinforcement Learning'☆19Jun 19, 2023Updated 2 years ago
- Code and data to support Bamman et al. (2020), "A Dataset of Literary Coreference" (LREC)☆10Dec 8, 2022Updated 3 years ago
- ☆16Feb 23, 2024Updated 2 years ago
- Codes for coreference-aware machine reading comprehension☆13Mar 13, 2022Updated 4 years ago
- ☆14Jul 25, 2024Updated last year
- ☆15Jul 1, 2020Updated 5 years ago
- [ICML2025] Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference"☆22Feb 16, 2025Updated last year
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset☆112May 22, 2025Updated 10 months ago