Khan / tutoring-accuracy-dataset
This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the performance and challenges of Large Language Models (LLMs) in math tutoring scenarios, providing a benchmark dataset for evaluating LLM accuracy in educational contexts.
☆39 Updated 5 months ago
Alternatives and similar repositories for tutoring-accuracy-dataset:
Users who are interested in tutoring-accuracy-dataset are comparing it to the repositories listed below.
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆82 Updated 6 months ago
- ☆32 Updated last year
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake… ☆32 Updated 7 months ago
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023 ☆47 Updated 11 months ago
- The repository contains the code and dataset for the Socratic Debugging task which is a novel task for Socratically Questioning Novice De… ☆15 Updated 10 months ago
- Bayesian IRT models in Python ☆133 Updated last month
- ☆12 Updated last week
- ☆90 Updated 8 months ago
- Open Source Intelligent Tutoring System w/ BKT (ReactJS and Firebase) ☆104 Updated this week
- The TalkMoves Dataset: K-12 mathematics lesson transcripts annotated for teacher and student discursive moves ☆27 Updated 3 years ago
- ☆10 Updated last year
- Code and data for the paper "Measuring Conversational Uptake: A Case-Study on Student-Teacher Interactions" ☆24 Updated 2 years ago
- ☆104 Updated 9 months ago
- Code to compute AnthroScore, a computational linguistic measure of anthropomorphism in text ☆10 Updated 4 months ago
- A corpus and code for understanding norms and subjectivity. 🤖 ☆47 Updated 4 months ago
- Computerized Adaptive Testing ☆49 Updated 6 months ago
- Code/data for MARG (multi-agent review generation) ☆38 Updated 3 months ago
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod… ☆33 Updated 11 months ago
- Package to extract connotation frames ☆83 Updated last year
- ☆21 Updated 3 years ago
- This repository contains a dataset containing ≈2K dialogues whose listener utterances are annotated from labels derived from the Motiva-… ☆14 Updated 2 years ago
- Code for "Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies" ☆30 Updated 10 months ago
- NAEP Math Assessment Item Score Prediction Challenge (Spring 2023) ☆14 Updated last year
- An Item Response Theory Package for Python ☆113 Updated 2 years ago
- A Computational Framework for Behavioral Assessment of LLM Therapists ☆25 Updated 4 months ago
- ☆33 Updated 4 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆125 Updated 11 months ago
- Detecting Bias and ensuring Fairness in AI solutions ☆87 Updated 2 years ago
- ☆65 Updated 10 months ago
- FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning. Presented at EACL 2023. ☆25 Updated last year