Khan/tutoring-accuracy-dataset

This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the performance and challenges of Large Language Models (LLMs) in math tutoring scenarios, providing a benchmark dataset for evaluating LLM accuracy in educational contexts.

☆58

Alternatives and similar repositories for tutoring-accuracy-dataset

Users that are interested in tutoring-accuracy-dataset are comparing it to the libraries listed below.

umass-ml4ed / dialogue-kt
Code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs" at LAK2025.
☆39 · Feb 12, 2025 · Updated last year
eth-nlped / mathdial
🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023
☆89 · Sep 17, 2025 · Updated 10 months ago
rosewang2008 / bridge
NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake…
☆46 · Jul 21, 2024 · Updated 2 years ago
kstats / CIMA
☆24 · Jul 6, 2021 · Updated 5 years ago
ArgLab / writing_observer
Writing Observer and Learning Observer: A system for monitoring learning process data, with an initial focus on writing process data from…
☆12 · Updated this week
eth-lre / mathtutorbench
Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors, EMNLP 2025 Oral
☆40 · Nov 18, 2025 · Updated 8 months ago
luffycodes / Tutorbot-Spock
An Education Tutoring Chatbot based on Learning Science Principles powered by Large Language Models
☆56 · Nov 4, 2024 · Updated last year
eth-lre / PedagogicalRL
Multi-turn RL framework for aligning models to be tutors instead of answerers. EMNLP 2025 Oral
☆42 · Dec 11, 2025 · Updated 7 months ago
EduNLP / edu-convokit
Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
☆115 · Apr 19, 2025 · Updated last year
LinxZhao / VizChat-pub
☆11 · May 30, 2024 · Updated 2 years ago
gao-xiao-bai / JsonTuning
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
☆10 · Nov 3, 2024 · Updated last year
DigitalHarborFoundation / FlexEval
FlexEval is an LLM evaluation tool designed for practical quantitative analysis.
☆16 · Updated this week
SLCLADAL / SLCLADAL.github.io
This is the website for the Language Technology and Data Analysis Laboratory (LADAL) which is part of the School of Languages and Culture…
☆14 · Mar 30, 2026 · Updated 3 months ago
princeton-nlp / LM-Science-Tutor
☆50 · Aug 6, 2024 · Updated last year
andrewpknight / zoomGroupStats
R package that provides utilities for processing and analyzing the files that are exported from a recorded 'Zoom' Meeting. This includes …
☆16 · Apr 5, 2022 · Updated 4 years ago
iwangjian / Plan4RecDial
Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems
☆12 · Aug 1, 2023 · Updated 2 years ago
longwind48 / convo-miner
Mine conversations from novels in Project Gutenberg, to generate data for data-driven dialogue systems.
☆15 · May 7, 2019 · Updated 7 years ago
laser-institute / network-analysis
Social Network Analysis and STEM Education is designed to prepare researchers to apply network analysis in order to better understand and…
☆15 · Jul 14, 2025 · Updated last year
touretzkyds / DiffusionDemo
In-depth exploration of stable diffusion models, walking readers through the inner workings in a step-by-step manner.
☆21 · Sep 8, 2025 · Updated 10 months ago
kateto / Computational_Social_Science_Course_R_Code
This is part of the code used in my Computational Social Science doctoral seminar at Rutgers University in 2023.
☆13 · Jul 7, 2023 · Updated 3 years ago
VideoAnalysis / EDUVSUM
EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important tempo…
☆23 · Mar 8, 2024 · Updated 2 years ago
ddemszky / classroom-transcript-analysis
☆43 · Jun 15, 2026 · Updated last month
a-paxton / crqa-tools-and-more
Tutorials and tools to help with RQA and CRQA in R.
☆15 · Feb 4, 2025 · Updated last year
LCS2-IIITD / DaSLaM
☆17 · Oct 31, 2023 · Updated 2 years ago
mlinyun / lingyun-online-judge
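An Online Judge system built with C++ and Vue.js that compiles and runs code and tests programs against preset data. The project uses a decoupled front-end/back-end architecture and modular development, covering user, problem, announcement, discussion, solution, comment, evaluation record, and judging modules.
☆12 · May 13, 2026 · Updated 2 months ago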
conceptmath / conceptmath
[ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large …
☆26 · May 29, 2024 · Updated 2 years ago
pykt-team / pykt-toolkit
pyKT: A Python Library to Benchmark Deep Learning based Knowledge Tracing Models
☆421 · Updated this week
matinho13 / SentiArt
A simple vector space model based tool for sentiment analysis of literary texts
☆19 · Sep 17, 2024 · Updated last year
commonstandardsproject / standards-importer
A small ETL for importing data from the common standards project into a relational database
☆12 · Jul 17, 2018 · Updated 8 years ago
minjechoi / relationships
Official repository for the ICWSM '21 paper "More than meets the tie: Examining the Role of Interpersonal Relationships in Social Network…
☆12 · Apr 26, 2023 · Updated 3 years ago
meghabyte / acl2021-education
Code for "Question Generation for Adaptive Education", to appear at ACL 2021.
☆33 · Jul 18, 2021 · Updated 5 years ago
reiyw / pdf2sb
View presentation slides in Scrapbox
☆15 · Jun 5, 2025 · Updated last year
thusns / thu-wiki
A wiki platform for the students and teachers of Tsinghua University
☆15 · Jun 25, 2026 · Updated last month
tomasengelthaler / HumorNorms
Normative rating dataset of single-word humor.
☆15 · Jul 26, 2017 · Updated 9 years ago
Pillars-Creation / Visualglm-image-to-text
Adds some files missing from Visualglm so that Visualglm can be trained; the included example performs physiognomy (face-reading) recognition on human faces.
☆13 · Jun 7, 2023 · Updated 3 years ago
karinseve / OTTers
☆19 · Jun 7, 2021 · Updated 5 years ago
koheiw / wordvector
Train word and document vectors using quanteda
☆16 · Updated this week
exoskeletonzj / MARS
A Multi-Agent Approach Integrating Socratic Guidance for Automated Prompt Optimization
☆18 · Dec 15, 2025 · Updated 7 months ago
zepingyu0512 / in-context-mechanism
Code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…
☆13 · Nov 17, 2024 · Updated last year