Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors, EMNLP 2025 Oral
☆35Nov 18, 2025Updated 5 months ago
Alternatives and similar repositories for mathtutorbench
Users who are interested in mathtutorbench are comparing it to the libraries listed below.
- Multi-turn RL framework for aligning models to be tutors instead of answerers. EMNLP 2025 Oral☆34Dec 11, 2025Updated 4 months ago
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023☆80Sep 17, 2025Updated 7 months ago
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors☆27Mar 2, 2026Updated 2 months ago
- ☆38Feb 4, 2026Updated 2 months ago
- ☆24Jul 6, 2021Updated 4 years ago
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake…☆45Jul 21, 2024Updated last year
- FlexEval is an LLM evaluation tool designed for practical quantitative analysis.☆16Updated this week
- Official code and data repository of MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte…☆22Jun 3, 2024Updated last year
- MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Langu…☆15Apr 12, 2025Updated last year
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆33Dec 24, 2025Updated 4 months ago
- This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the p…☆57Aug 29, 2024Updated last year
- An implementation of the Latent Skill Embedding model☆10Feb 19, 2016Updated 10 years ago
- Askalot CQA System of Next Generation☆26Dec 14, 2022Updated 3 years ago
- Code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs" at LAK2025.☆32Feb 12, 2025Updated last year
- Repo for paper: https://arxiv.org/abs/2404.06479☆30Oct 3, 2024Updated last year
- ☆20Apr 16, 2025Updated last year
- A simple Dataset generator for Moving Mnist☆14May 26, 2023Updated 2 years ago
- BERT score for text generation☆12Jan 15, 2025Updated last year
- This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DP…☆33Dec 5, 2024Updated last year
- ☆23Mar 31, 2023Updated 3 years ago
- sigir2019_π-Net: A Parallel Information-sharing Network for Shared account Cross-domain Sequential Recommendations☆22Jun 7, 2021Updated 4 years ago
- A collection of R packages for educational datamining☆15Jan 14, 2019Updated 7 years ago
- Because you're computing conversion rates wrong☆16May 23, 2017Updated 8 years ago
- ☆26Apr 18, 2020Updated 6 years ago
- Ross extension to Chernoff faces☆11May 21, 2018Updated 7 years ago
- ☆28Nov 10, 2025Updated 5 months ago
- Data, code, and images for a posting summarizing three studies about pie charts☆15Jul 12, 2016Updated 9 years ago
- Data on international first names and sex of people with that name☆13Jan 12, 2019Updated 7 years ago
- A pipeline for the automatic construction of geometry problems along with step-by-step solutions.☆17Aug 27, 2025Updated 8 months ago
- CEduMEval : A Chinese educational multi-task evaluation benchmark☆17Nov 18, 2024Updated last year
- Data consisting only of dialogue example sentences extracted from a dictionary☆16Apr 24, 2023Updated 3 years ago
- A web front end for R Twitter sentiment analysis☆18Sep 2, 2011Updated 14 years ago
- Automatic Generation of Scaffolding Questions for Learning Math, EMNLP 2022. RL, REINFORCE☆26Jun 30, 2023Updated 2 years ago
- Race and Ethnicity based on name using data from census, voter reg. files, etc.☆11Jan 17, 2018Updated 8 years ago
- Fitting stochastic blockmodels to graphs☆17Jul 8, 2016Updated 9 years ago
- Beyond LM: How can language model go forward in the future?☆15Apr 30, 2023Updated 3 years ago
- A method for estimating causal effects in time-series data. Uses available data to automatically find natural experiments for identifying…☆17Dec 16, 2019Updated 6 years ago
- Official repository for "EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scena…☆21May 28, 2025Updated 11 months ago
- 1-day R workshop for experienced users. First session covers script writing, file names and overall project structures. The second sessio…☆13Jan 25, 2019Updated 7 years ago