wenhuchen / TheoremQA
The dataset and code for the paper "TheoremQA: A Theorem-driven Question Answering dataset"
☆160Apr 23, 2024Updated last year
Alternatives and similar repositories for TheoremQA
Users that are interested in TheoremQA are comparing it to the libraries listed below
- The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023)☆38May 15, 2024Updated last year
- ☆130Jul 8, 2024Updated last year
- Data and Code for Program of Thoughts [TMLR 2023]☆306May 15, 2024Updated last year
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆270Sep 12, 2024Updated last year
- Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023.☆64Nov 27, 2024Updated last year
- Implementation of the paper: "Turning Tables: Generating Examples from Semi-structured Tables for Endowing Language Models with Reasoning…☆22Nov 2, 2021Updated 4 years ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆61Jan 27, 2025Updated last year
- Resources of deep learning for mathematical reasoning (DL4MATH).☆370Dec 22, 2023Updated 2 years ago
- ☆30Dec 27, 2024Updated last year
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆207May 24, 2023Updated 2 years ago
- ☆27Sep 11, 2024Updated last year
- Codes and Pre-trained models for RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training [ACM MM 202…☆29Nov 2, 2023Updated 2 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)☆23Oct 2, 2025Updated 4 months ago
- Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions"☆75Mar 3, 2022Updated 3 years ago
- This repository contains a collection of papers and resources on Reasoning in Large Language Models.☆567Nov 13, 2023Updated 2 years ago
- A unified benchmark for math reasoning☆89Jan 25, 2023Updated 3 years ago
- NeqLIPS: a powerful Olympiad-level inequality prover☆39Sep 7, 2025Updated 5 months ago
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning".☆165Dec 27, 2023Updated 2 years ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024]☆383Aug 25, 2024Updated last year
- AI for Mathematics Paper List☆17Jan 14, 2025Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"☆21Mar 4, 2025Updated 11 months ago
- Preparing for ML Interviews.☆53Jan 12, 2026Updated last month
- ☆22Apr 12, 2022Updated 3 years ago
- ☆26May 30, 2023Updated 2 years ago
- ☆25Aug 23, 2024Updated last year
- This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models"☆29Oct 8, 2023Updated 2 years ago
- ☆31Sep 4, 2021Updated 4 years ago
- The Mix of Minimal Optimal Sets (MMOS) dataset has two advantages: higher performance and lower construction costs on math…☆74Jul 27, 2024Updated last year
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Apr 26, 2023Updated 2 years ago
- Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".☆1,139Dec 23, 2023Updated 2 years ago
- [NeurIPS D&B 2024] Generative AI for Math: MathPile☆419Apr 4, 2025Updated 10 months ago
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist☆35Oct 23, 2024Updated last year
- Benchmarking large language models' complex reasoning ability with chain-of-thought prompting☆2,768Aug 4, 2024Updated last year
- VaLM: Visually-augmented Language Modeling. ICLR 2023.☆56Mar 6, 2023Updated 2 years ago
- Code for ACL2023 paper: Pre-Training to Learn in Context☆106Jul 26, 2024Updated last year
- [TMLR] Cumulative Reasoning With Large Language Models (https://arxiv.org/abs/2308.04371)☆308Aug 2, 2025Updated 6 months ago
- ML Benchmarks in Algebraic Combinatorics☆23Jan 15, 2026Updated 3 weeks ago
- A RL env with procedurally generated symbolic reasoning data☆33Feb 3, 2026Updated last week
- Source codes and datasets for How well do Large Language Models perform in Arithmetic tasks?☆57Apr 17, 2023Updated 2 years ago