Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors, EMNLP 2025 Oral
☆36 · Nov 18, 2025 · Updated 6 months ago
Alternatives and similar repositories for mathtutorbench
Users that are interested in mathtutorbench are comparing it to the libraries listed below.
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023 ☆82 · Sep 17, 2025 · Updated 8 months ago
- ☆40 · Feb 4, 2026 · Updated 4 months ago
- ☆24 · Jul 6, 2021 · Updated 4 years ago
- NAACL 2024. Code & Dataset for "Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake… ☆45 · Jul 21, 2024 · Updated last year
- FlexEval is an LLM evaluation tool designed for practical quantitative analysis. ☆16 · Updated this week
- Official code and data repository of MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 · Jun 3, 2024 · Updated 2 years ago
- MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Langu… ☆15 · Apr 12, 2025 · Updated last year
- This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the p… ☆57 · Aug 29, 2024 · Updated last year
- Generating Teacher Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots, ACL 2024 Findings ☆13 · Mar 27, 2025 · Updated last year
- Code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs" at LAK2025. ☆34 · Feb 12, 2025 · Updated last year
- ☆26 · Dec 15, 2024 · Updated last year
- BERT score for text generation ☆12 · Jan 15, 2025 · Updated last year
- ☆23 · Mar 31, 2023 · Updated 3 years ago
- ☆15 · Jan 2, 2022 · Updated 4 years ago
- A collection of R packages for educational datamining ☆15 · Jan 14, 2019 · Updated 7 years ago
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist ☆34 · Oct 23, 2024 · Updated last year
- ☆26 · Apr 18, 2020 · Updated 6 years ago
- ☆27 · Jun 2, 2026 · Updated last week
- Data on international first names and sex of people with that name ☆13 · Jan 12, 2019 · Updated 7 years ago
- A pipeline for the automatic construction of geometry problems along with step-by-step solutions. ☆17 · Aug 27, 2025 · Updated 9 months ago
- CEduMEval: A Chinese educational multi-task evaluation benchmark ☆17 · Nov 18, 2024 · Updated last year
- Data consisting only of example sentences extracted from a dictionary ☆16 · Apr 24, 2023 · Updated 3 years ago
- Command-line utility for fitting Hidden Markov Models at scale ☆58 · Apr 14, 2026 · Updated last month
- Official repository for "EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scena… ☆24 · May 28, 2025 · Updated last year
- Compare geographic features ☆15 · May 23, 2023 · Updated 3 years ago
- Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes ☆20 · May 30, 2023 · Updated 3 years ago
- Exploring classifier-free guidance in a DDPM language model for text generation towards emotion targets. ☆11 · Sep 7, 2025 · Updated 9 months ago
- https://openreview.net/forum?id=OC1o4_OI6Jw ☆13 · May 27, 2022 · Updated 4 years ago
- ☆36 · May 24, 2025 · Updated last year
- Tutorial for AutoGen with Amadeus flights API, Parallel Function Calls ☆48 · Apr 6, 2024 · Updated 2 years ago
- Make a DocumentTermMatrix faster ☆22 · Oct 24, 2023 · Updated 2 years ago
- Memory-optimized training scripts for video models based on Diffusers ☆16 · Jan 3, 2025 · Updated last year
- Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges. ☆17 · Dec 14, 2021 · Updated 4 years ago
- Monotonic Attention based ConvBERT for Knowledge Tracing ☆15 · Sep 14, 2022 · Updated 3 years ago
- Toolbox for people detecting, tracking, and re-identifying. ☆18 · May 26, 2026 · Updated 2 weeks ago
- The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus ☆27 · Nov 26, 2025 · Updated 6 months ago
- Dataset of conversations, generated by prompting Gemini Ultra. These are conversations between a teacher and a student, where the teacher… ☆36 · Oct 29, 2024 · Updated last year
- ☆19 · Sep 20, 2022 · Updated 3 years ago
- [CIKM 2022] Contrastive Cross-Domain Sequential Recommendation ☆38 · Dec 7, 2022 · Updated 3 years ago