A new dataset of difficult graduate-level applied mathematics problems; evaluations demonstrate that leading LLMs currently exhibit low accuracy in solving these problems.
☆26Feb 14, 2025Updated last year
Alternatives and similar repositories for HARDMath
Users who are interested in HARDMath are comparing it to the repositories listed below.
- The rule-based evaluation subset and code implementation of Omni-MATH☆26Dec 23, 2024Updated last year
- Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…☆13Aug 8, 2025Updated 6 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆72Feb 25, 2025Updated last year
- ☆79Nov 19, 2024Updated last year
- ☆16Jul 7, 2025Updated 7 months ago
- This is an example server for AudioConnector to be used by Genesys Cloud customers to help get them acquainted with the AudioConnector Pr…☆16Jan 2, 2026Updated 2 months ago
- ☆85Jan 25, 2025Updated last year
- LaTeX Beamer template crafted for University of Illinois Chicago☆11Dec 7, 2024Updated last year
- [ACL 2024]Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie…☆183Jun 8, 2025Updated 8 months ago
- The implementation of paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee…☆37Jul 25, 2024Updated last year
- Simulation, multi-path estimation, and CBR parsing code of SIGCOMM2023 BeamSense CBR-Sensing☆10Jan 14, 2024Updated 2 years ago
- SAM Template with Lambda Function to spin up a DynamoDB backed Movies API and attach APIGW Resource Policy to it.☆13Jun 12, 2018Updated 7 years ago
- A Controllable Model of Grounded Response Generation (AAAI 21)☆13Oct 25, 2022Updated 3 years ago
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆12May 5, 2025Updated 9 months ago
- ☆14Oct 21, 2024Updated last year
- ☆13Mar 25, 2022Updated 3 years ago
- ☆13Jun 25, 2025Updated 8 months ago
- Code for ICML 2022 paper: Achieving Fairness at No Utility Cost via Data Reweighing with Influence☆11Aug 3, 2022Updated 3 years ago
- Implementation of a compact optical neural network SqueezeLight based on multi-operand micro-rings, DATE 2021☆13Oct 26, 2022Updated 3 years ago
- ☆11Jun 5, 2024Updated last year
- Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"☆11Nov 18, 2022Updated 3 years ago
- Code and data to support Bamman et al. (2020), "A Dataset of Literary Coreference" (LREC)☆10Dec 8, 2022Updated 3 years ago
- This repository includes the implementation and results of the paper "ChatGPT is fun, but it is not funny! Humor is still challenging Lar…☆13Jul 13, 2023Updated 2 years ago
- Code for "ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer"☆15Jul 17, 2024Updated last year
- ☆14Oct 21, 2025Updated 4 months ago
- Use Amazon Lex as a conversational interface with Twilio Media Streams☆13Feb 20, 2026Updated last week
- The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"☆23Oct 14, 2025Updated 4 months ago
- Code of the paper "Synthesizing Aspect-Driven Recommendation Explanations from Reviews", IJCAI'20☆10Apr 5, 2024Updated last year
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆57Oct 10, 2025Updated 4 months ago
- A small toy that uses ChatGPT to generate command-line commands☆10Mar 7, 2023Updated 2 years ago
- ☆15Dec 2, 2025Updated 3 months ago
- The official repository for the paper entitled "Time Travel in LLMs: Tracing Data Contamination in Large Language Models."☆12Jun 11, 2024Updated last year
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…☆116Feb 9, 2024Updated 2 years ago
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset☆111May 22, 2025Updated 9 months ago
- piggybacking on the Dafny language implementation to explore interactive semi-automated verified program synthesis, combining LLMs and sy…☆16Feb 20, 2026Updated last week
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 4 months ago
- Ensemble Neural Representation Networks☆12Jan 5, 2022Updated 4 years ago
- Convert MathML to Latex for OneNote to Markdown☆12Jul 27, 2022Updated 3 years ago
- A WeChat bot that lets group chat members look up token details☆10Mar 25, 2025Updated 11 months ago