causalNLP / cladder
We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs.
☆115 Updated 10 months ago
Alternatives and similar repositories for cladder:
Users who are interested in cladder are comparing it to the repositories listed below.
- Data and code for the Corr2Cause paper (ICLR 2024) ☆96 Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆103 Updated last year
- ☆82 Updated 8 months ago
- ☆68 Updated last year
- The Prism Alignment Project ☆73 Updated 11 months ago
- Inspecting and Editing Knowledge Representations in Language Models ☆115 Updated last year
- Code/data for MARG (multi-agent review generation) ☆42 Updated 5 months ago
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆74 Updated last year
- ☆114 Updated 8 months ago
- Solving the causality pairs challenge (does A cause B) with ChatGPT ☆76 Updated 10 months ago
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆105 Updated last year
- PASTA: Post-hoc Attention Steering for LLMs ☆113 Updated 4 months ago
- Repo for "When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment" ☆38 Updated last year
- ☆90 Updated 9 months ago
- Official implementation of the ACL 2024 paper "Scientific Inspiration Machines Optimized for Novelty" ☆77 Updated last year
- Code and data for "MIRAI: Evaluating LLM Agents for Event Forecasting" ☆62 Updated 9 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆84 Updated 3 weeks ago
- ☆132 Updated 5 months ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆59 Updated last year
- ☆17 Updated last year
- ☆48 Updated last year
- Extending Conformal Prediction to LLMs ☆66 Updated 9 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆158 Updated 11 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 Updated 5 months ago
- Generating diverse counterfactual data for Natural Language Understanding tasks using Large Language Models (LLMs). The generator support… ☆36 Updated last year
- Conformal Language Modeling ☆28 Updated last year
- ☆90 Updated 2 months ago
- The data for the CRASS-benchmark ☆16 Updated 2 years ago
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆43 Updated 9 months ago
- ☆50 Updated last week