taisazero / socratic-debugging-benchmark
This repository contains the code and dataset for Socratic Debugging, a novel task in which novice debuggers are questioned Socratically to guide them toward discovering and fixing the bug in a buggy Python program.
☆19 Updated last year
Alternatives and similar repositories for socratic-debugging-benchmark
Users who are interested in socratic-debugging-benchmark are comparing it to the repositories listed below.
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake… ☆45 Updated last year
- Inspecting and Editing Knowledge Representations in Language Models ☆119 Updated 2 years ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆165 Updated last year
- Source code for the paper "Active Prompting with Chain-of-Thought for Large Language Models" ☆248 Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆99 Updated 2 years ago
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023 ☆71 Updated 3 months ago
- ☆100 Updated last year
- Code and data associated with the AmbiEnt dataset in "We're Afraid Language Models Aren't Modeling Ambiguity" (Liu et al., 2023) ☆64 Updated last year
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors ☆24 Updated this week
- [ICML 2023] Code for our paper “Compositional Exemplars for In-context Learning”. ☆102 Updated 2 years ago
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆110 Updated 2 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆86 Updated last year
- ☆47 Updated last year
- Data and code for the paper "Inducing Positive Perspectives with Text Reframing" ☆61 Updated 2 years ago
- Token-level Reference-free Hallucination Detection ☆97 Updated 2 years ago
- Implementation of the paper: "Answering Questions by Meta-Reasoning over Multiple Chains of Thought" ☆96 Updated last year
- ☆50 Updated last year
- Codes and Datasets for our ACL 2023 paper on cognitive reframing of negative thoughts ☆66 Updated 2 years ago
- Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models" ☆56 Updated 2 years ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆61 Updated 10 months ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆147 Updated last year
- ☆116 Updated last year
- A package to generate summaries of long-form text and evaluate the coherence of these summaries. Official package for our ICLR 2024 paper… ☆128 Updated last year
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆136 Updated last year
- The Prism Alignment Project ☆87 Updated last year
- Repository for Decomposed Prompting ☆95 Updated 2 years ago
- RARR: Researching and Revising What Language Models Say, Using Language Models ☆49 Updated 2 years ago
- Repository for research in the field of Responsible NLP at Meta. ☆204 Updated 7 months ago
- ☆50 Updated 2 years ago
- Language Models of Code are Few-Shot Commonsense Learners (EMNLP 2022) ☆86 Updated 2 years ago