Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors, EMNLP 2025 Oral
☆35 Nov 18, 2025 Updated 6 months ago
Alternatives and similar repositories for mathtutorbench
Users that are interested in mathtutorbench are comparing it to the libraries listed below.
- Multi-turn RL framework for aligning models to be tutors instead of answerers. EMNLP 2025 Oral ☆36 Dec 11, 2025 Updated 5 months ago
- ☆39 Feb 4, 2026 Updated 3 months ago
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake… ☆45 Jul 21, 2024 Updated last year
- FlexEval is an LLM evaluation tool designed for practical quantitative analysis. ☆16 May 12, 2026 Updated last week
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆22 Jun 3, 2024 Updated last year
- MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Langu… ☆15 Apr 12, 2025 Updated last year
- OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, … ☆19 Jun 25, 2024 Updated last year
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs ☆34 Dec 24, 2025 Updated 4 months ago
- This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the p… ☆57 Aug 29, 2024 Updated last year
- Generating Teacher Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots, ACL 2024 Findings ☆13 Mar 27, 2025 Updated last year
- Askalot CQA System of Next Generation ☆26 Dec 14, 2022 Updated 3 years ago
- Code for the paper "Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs" at LAK2025. ☆33 Feb 12, 2025 Updated last year
- development moved to https://github.com/myudelson/hmm-scalable ☆38 May 7, 2019 Updated 7 years ago
- Repo for paper: https://arxiv.org/abs/2404.06479 ☆30 Oct 3, 2024 Updated last year
- ☆10 Jan 30, 2017 Updated 9 years ago
- BERT score for text generation ☆12 Jan 15, 2025 Updated last year
- This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DP… ☆33 Dec 5, 2024 Updated last year
- ☆23 Mar 31, 2023 Updated 3 years ago
- sigir2019_π-Net: A Parallel Information-sharing Network for Shared account Cross-domain Sequential Recommendations ☆22 Jun 7, 2021 Updated 4 years ago
- Djinn-Agent: A lightweight CLI tool for seamless interaction with Claude's advanced computer-use capabilities, automating complex tasks f… ☆27 Oct 28, 2024 Updated last year
- ☆15 Jan 2, 2022 Updated 4 years ago
- ☆23 Aug 1, 2022 Updated 3 years ago
- A collection of R packages for educational datamining ☆15 Jan 14, 2019 Updated 7 years ago
- [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist ☆34 Oct 23, 2024 Updated last year
- Data, code, and images for a posting summarizing three studies about pie charts ☆15 Jul 12, 2016 Updated 9 years ago
- Data on international first names and sex of people with that name ☆13 Jan 12, 2019 Updated 7 years ago
- ☆29 Jun 5, 2025 Updated 11 months ago
- CEduMEval: A Chinese educational multi-task evaluation benchmark ☆17 Nov 18, 2024 Updated last year
- Data consisting only of conversational example sentences extracted from a dictionary ☆16 Apr 24, 2023 Updated 3 years ago
- A web front end for R Twitter sentiment analysis ☆18 Sep 2, 2011 Updated 14 years ago
- Automatic Generation of Scaffolding Questions for Learning Math, EMNLP 2022. RL, REINFORCE ☆26 Jun 30, 2023 Updated 2 years ago
- Remove unwanted LaTeX commands and their associated closing brackets ☆11 Jul 22, 2024 Updated last year
- Command-line utility for fitting Hidden Markov Models at scale ☆58 Apr 14, 2026 Updated last month
- Common machine learning algorithms implemented in numpy. Classification models: KNN, LDA, LR, Decision Tree (ID3, C4.5, CART), RF, perceptron, SVM, Neural network, GBDT, Xgboost, Adaboost; regression models: LASSO, Ridg… ☆24 Feb 19, 2022 Updated 4 years ago
- Beyond LM: How can language models go forward in the future? ☆15 Apr 30, 2023 Updated 3 years ago
- A method for estimating causal effects in time-series data. Uses available data to automatically find natural experiments for identifying… ☆17 Dec 16, 2019 Updated 6 years ago
- Unix stream tool for JavaScript and JSON ☆16 Feb 26, 2011 Updated 15 years ago
- Official repository for "EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scena… ☆21 May 28, 2025 Updated 11 months ago
- Compare geographic features ☆15 May 23, 2023 Updated 2 years ago