rosewang2008 / bridge
NAACL 2024. Code & Dataset for "Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes"
☆36 · Updated 9 months ago
Alternatives and similar repositories for bridge:
Users who are interested in bridge are comparing it to the libraries listed below.
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆92 · Updated 8 months ago
- ☆68 · Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆126 · Updated last year
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆69 · Updated 2 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆83 · Updated 8 months ago
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆74 · Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆36 · Updated 3 months ago
- The repository contains the code and dataset for the Socratic Debugging task which is a novel task for Socratically Questioning Novice De… ☆17 · Updated last year
- Learning to route instances for Human vs AI Feedback ☆23 · Updated 2 months ago
- Functional Benchmarks and the Reasoning Gap ☆85 · Updated 6 months ago
- 🧮 MathDial: A Dialog Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems, EMNLP Findings 2023 ☆52 · Updated last month
- Code accompanying "How I learned to start worrying about prompt formatting". ☆104 · Updated 6 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 · Updated 4 months ago
- ☆33 · Updated 2 years ago
- ☆35 · Updated 6 months ago
- CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments ☆51 · Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆77 · Updated 7 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆118 · Updated last year
- The Prism Alignment Project ☆73 · Updated 11 months ago
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors ☆8 · Updated last week
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆68 · Updated 10 months ago
- ☆106 · Updated 11 months ago
- ☆120 · Updated 6 months ago
- ☆93 · Updated 10 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆80 · Updated 4 months ago
- An attribution library for LLMs ☆38 · Updated 7 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆80 · Updated last year
- ☆39 · Updated 2 years ago
- ☆21 · Updated 10 months ago