eth-lre / mathtutorbench
Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
☆14Updated last month
Alternatives and similar repositories for mathtutorbench
Users who are interested in mathtutorbench are comparing it to the repositories listed below
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023☆55Updated 3 months ago
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake…☆41Updated 11 months ago
- This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the p…☆47Updated 9 months ago
- Codes for papers on Large Language Models Personalization (LaMP)☆163Updated 4 months ago
- Awesome LLM for NLG Evaluation Papers☆24Updated last year
- Code and data for Marked Personas (ACL 2023)☆26Updated 2 years ago
- ☆95Updated 8 months ago
- ☆37Updated 8 months ago
- ☆36Updated 2 years ago
- This is a repository for sharing papers in the field of persona-based conversational AI. The related source code for each paper is linked…☆162Updated 11 months ago
- ☆67Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆69Updated last year
- Awesome papers for role-playing with language models☆193Updated 7 months ago
- ☆75Updated 6 months ago
- ☆73Updated last year
- ☆17Updated 5 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆144Updated last month
- Critique-out-Loud Reward Models☆66Updated 8 months ago
- Multilingual Large Language Models Evaluation Benchmark☆124Updated 10 months ago
- ☆106Updated last year
- ☆36Updated 5 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024)☆112Updated last year
- paper list on reasoning in NLP☆190Updated 2 months ago
- RecAlpaca: A simple framework combining Alpaca and Recommendations.☆34Updated 2 years ago
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models☆99Updated 10 months ago
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors☆12Updated 2 weeks ago
- The Prism Alignment Project☆77Updated last year
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't…☆114Updated 11 months ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆158Updated last month
- Codes for our paper "ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate"☆285Updated 8 months ago