TIGER-AI-Lab/TheoremQA

The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023)

☆40

Alternatives and similar repositories for TheoremQA

Users interested in TheoremQA are comparing it to the repositories listed below.

wenhuchen / TheoremQA
The dataset and code for paper: TheoremQA: A Theorem-driven Question Answering dataset
☆161 · Apr 23, 2024 · Updated 2 years ago
CLUEbenchmark / SuperCLUE-Code3
A native Chinese, tiered benchmark for coding ability
☆15 · Apr 11, 2024 · Updated 2 years ago
tengxiaoliu / XoT
[EMNLP 2023] Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts
☆27 · Nov 4, 2023 · Updated 2 years ago
allenai / numglue
NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks
☆20 · May 10, 2022 · Updated 4 years ago
TIGER-AI-Lab / MAmmoTH
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024]
☆383 · Aug 25, 2024 · Updated last year
TIGER-AI-Lab / VideoEval-Pro
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation [TMLR26]
☆15 · Jun 1, 2026 · Updated last month
oashua / MathAgent
Code repo for MathAgent
☆20 · Dec 15, 2023 · Updated 2 years ago
haoyuzhao123 / LeanIneqComp
An inequality benchmark for theorem proving
☆22 · Feb 1, 2026 · Updated 5 months ago
mikejqzhang / SituatedQA
☆23 · Aug 10, 2022 · Updated 3 years ago
mandyyyyii / scibench
☆132 · Jul 8, 2024 · Updated 2 years ago
yeahrmek / pylean
Python wrapper for lean-gym
☆13 · Apr 5, 2023 · Updated 3 years ago
friederrr / GHOSTS
GHOSTS dataset
☆39 · Jul 19, 2023 · Updated 3 years ago
arubique / OCCAM
This is an implementation of the paper "Are We Done with Object-Centric Learning?"
☆13 · Jun 21, 2026 · Updated last month
Riccorl / chinese-word-segmentation-pytorch
Chinese Word Segmentation task based on BERT and implemented in Pytorch
☆14 · Aug 14, 2020 · Updated 5 years ago
infi-coder / infibench-evaluation-harness
The Infibench variant of bigcode-evaluation-harness --- a framework for the evaluation of autoregressive code generation language models.
☆14 · Oct 19, 2024 · Updated last year
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49 · May 26, 2024 · Updated 2 years ago
OpenBMB / OlympiadBench
[ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie…
☆195 · Jun 8, 2025 · Updated last year
TIGER-AI-Lab / ImagenWorld
Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks [ICLR 2026]
☆32 · Apr 2, 2026 · Updated 3 months ago
mrc03 / Cats-vs-Dogs-CNN-Keras
The famous Cats-vs-Dogs dataset. I have used a self laid ConvNet to classify the image into 2 classes either a Dog or a Cat. The images u…
☆11 · Aug 24, 2018 · Updated 7 years ago
MinkaiXu / AliDiff
NeurIPS24: Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization
☆43 · Apr 2, 2025 · Updated last year
trishullab / itp-interface
Generic interface for hooking up to any Interactive Theorem Prover (ITP) and collecting data for training ML models for AI in formal theo…
☆19 · Jul 10, 2026 · Updated last week
wenhuchen / Time-Sensitive-QA
Code and Data for NeurIPS2021 Paper "A Dataset for Answering Time-Sensitive Questions"
☆77 · Mar 3, 2022 · Updated 4 years ago
nyu-mll / quality
☆151 · Jan 17, 2025 · Updated last year
iiis-ai / IterativeQuestionComposing
[AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)
☆23 · Oct 2, 2025 · Updated 9 months ago
shizhediao / automate-cot
Source code for the paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data"
☆20 · Feb 24, 2024 · Updated 2 years ago
whyNLP / Conic10K
Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings.
☆33 · Dec 6, 2023 · Updated 2 years ago
TIGER-AI-Lab / AceCoder
The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]
☆100 · Apr 9, 2025 · Updated last year
CGCL-codes / GraphInstruct
The benchmark proposed in paper: GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability
☆25 · Aug 12, 2025 · Updated 11 months ago
jnwnlee / selva
[CVPR 2026] Official PyTorch implementation of SelVA "Hear What Matters! Text-conditioned Selective Video-to-Audio Generation"
☆15 · Mar 27, 2026 · Updated 3 months ago
JIA-Lab-research / Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
☆398 · Jan 19, 2025 · Updated last year
princeton-nlp / ELIZA-Transformer
[NAACL 2025] Representing Rule-based Chatbots with Transformers
☆23 · Feb 9, 2025 · Updated last year
H-TayyarMadabushi / AStitchInLanguageModels
Data and Baselines for AStitchInLanguageModels dataset
☆13 · Oct 31, 2022 · Updated 3 years ago
nju-websoft / HuggingBench
[SIGIR 2025] Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph
☆16 · Jun 6, 2025 · Updated last year
svjack / CodeActAgent-Gradio
UnOfficial Gradio Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Y…
☆16 · Sep 30, 2024 · Updated last year
liuchengwucn / FIMO
☆38 · Jun 30, 2026 · Updated 3 weeks ago
AmenRa / a-multi-domain-benchmark-for-personalized-search-evaluation
A Multi-domain Benchmark for Personalized Search Evaluation
☆12 · Sep 7, 2023 · Updated 2 years ago
kyegomez / TinyGPTV
Simple Implementation of TinyGPTV in super simple Zeta lego blocks
☆16 · Nov 11, 2024 · Updated last year
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Updated this week
McGill-NLP / VinePPO
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
☆192 · May 25, 2025 · Updated last year