CEBaBing / CEBaB

CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior

☆12

Alternatives and similar repositories for CEBaB

Users who are interested in CEBaB are comparing it to the repositories listed below.

tatsu-lab / opinions_qa
☆108 Updated last year
tsor13 / kaleido
☆22 Updated last year
Betswish / MIRAGE
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/
☆24 Updated 4 months ago
pliang279 / sent_debias
[ACL 2020] Towards Debiasing Sentence Representations
☆66 Updated 2 years ago
pliang279 / LM_bias
[ICML 2021] Towards Understanding and Mitigating Social Biases in Language Models
☆61 Updated 2 years ago
balevinstein / Probes
☆51 Updated 2 years ago
abaheti95 / ToxiChat
Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming so…
☆17 Updated last year
jkallini / mission-impossible-language-models
Code repository for the paper "Mission: Impossible Language Models."
☆52 Updated 2 months ago
aaronmueller / MIB
Landing page for MIB: A Mechanistic Interpretability Benchmark
☆16 Updated last week
google / belief-localization
This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca…
☆61 Updated 2 years ago
DiLi-Lab / ScanDL
☆14 Updated 2 months ago
yanaiela / pararel
☆45 Updated last year
allenai / few_shot_explanations
Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations"
☆31 Updated 2 years ago
feradauto / MoralCoT
Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
☆38 Updated 2 years ago
sylinrl / CalibratedMath
Teaching Models to Express Their Uncertainty in Words
☆39 Updated 3 years ago
faridlazuarda / cultural-llm-papers
A curated list of research papers and resources on Cultural LLM.
☆45 Updated 9 months ago
princeton-nlp / MABEL
EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975
☆38 Updated last year
SALT-NLP / mic
Data and code for the paper "The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems"
☆19 Updated last year
ruiqi-zhong / DescribeDistributionalDifferences
Code for preprint: Summarizing Differences between Text Distributions with Natural Language
☆42 Updated 2 years ago
BunsenFeng / PoliLean
Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod…
☆37 Updated last year
allenai / mice
☆26 Updated 2 years ago
casszhao / PruneHall
Codebase, data and models for hallucination of pruned models
☆16 Updated 6 months ago
Jiaxin-Pei / Potato-Prolific-Dataset
☆16 Updated 2 years ago
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆26 Updated last year
qkaren / COLD_decoding
☆108 Updated 3 years ago
BrachioLab / incontext_influences
In-context Example Selection with Influences
☆15 Updated 2 years ago
gsarti / pecore
Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑
☆15 Updated last year
peterbhase / LAS-NL-Explanations
Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"
☆22 Updated 4 years ago
evandez / REMEDI
Inspecting and Editing Knowledge Representations in Language Models
☆116 Updated last year
zouharvi / ryanize-bib
Highlight errors in a bib file: missing URLs, capitalization protection, etc.
☆27 Updated last year