Khan / tutoring-accuracy-dataset
This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the performance and challenges of Large Language Models (LLMs) in math tutoring scenarios, providing a benchmark dataset for evaluating LLM accuracy in educational contexts.
☆47Updated 9 months ago
Alternatives and similar repositories for tutoring-accuracy-dataset
Users who are interested in tutoring-accuracy-dataset are comparing it to the repositories listed below
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data☆95Updated 2 months ago
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023☆55Updated 3 months ago
- Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors☆14Updated last month
- Code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs" at LAK2025.☆16Updated 4 months ago
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake…☆41Updated 11 months ago
- ☆33Updated 2 years ago
- ☆22Updated 3 years ago
- ☆95Updated last year
- Codes and Datasets for our ACL 2023 paper on cognitive reframing of negative thoughts☆63Updated last year
- This is the data associated with the PERSUADE Corpus 2.0 version☆43Updated 7 months ago
- The Prism Alignment Project☆77Updated last year
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors☆12Updated 2 weeks ago
- Official repository for the AnnoMI dataset: the first public collection of expert-annotated MI transcripts.☆71Updated 2 years ago
- Computerized Adaptive Testing☆61Updated 10 months ago
- Code and data for the paper "Measuring Conversational Uptake: A Case-Study on Student-Teacher Interactions"☆24Updated 2 months ago
- ☆12Updated last week
- This repository contains a dataset containing ≈2K dialogues whose listener utterances are annotated from labels derived from the Motiva-…☆17Updated 2 years ago
- [NeurIPS 2023] Codebase for the paper: "Guiding Large Language Models with Directional Stimulus Prompting"☆111Updated 2 years ago
- The repository contains the code and dataset for the Socratic Debugging task which is a novel task for Socratically Questioning Novice De…☆18Updated last year
- Bayesian IRT models in Python☆142Updated this week
- A Computational Framework for Behavioral Assessment of LLM Therapists☆29Updated 8 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers☆130Updated last year
- The TalkMoves Dataset: K-12 mathematics lesson transcripts annotated for teacher and student discursive moves☆29Updated 3 years ago
- This repository contains the data and code introduced in the paper "CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Maske…☆120Updated last year
- Code for "Question Generation for Adaptive Education", to appear at ACL 2021.☆33Updated 3 years ago
- ☆26Updated last year
- Data for evaluating gender bias in coreference resolution systems.☆77Updated 6 years ago
- Multilingual Large Language Models Evaluation Benchmark☆124Updated 10 months ago
- ☆24Updated 2 years ago
- A dataset of over 10000 question and answer pairs written for storybooks.☆40Updated 2 years ago