Heidelberg-NLP/CC-SHAP

Code for "On Measuring Faithfulness of Natural Language Explanations"

☆23

Alternatives and similar repositories for CC-SHAP

Users who are interested in CC-SHAP are comparing it to the libraries listed below.

Heidelberg-NLP / CC-SHAP-VLM
Official code implementation for the paper "Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Expl…
☆12 · Jul 14, 2026 · Updated last week
DiLi-Lab / ScanDL
☆14 · Apr 29, 2025 · Updated last year
Heidelberg-NLP / MM-SHAP
This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision…
☆32 · Jul 14, 2026 · Updated last week
technion-cs-nlp / parametric-faithfulness
☆23 · Aug 30, 2025 · Updated 10 months ago
Betswish / MIRAGE
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/
☆25 · Mar 10, 2025 · Updated last year
FarnoushRJ / RelP
[NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La…
☆29 · Nov 3, 2025 · Updated 8 months ago
yoavgur / PISCES
🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models
☆13 · Jun 28, 2026 · Updated 3 weeks ago
xiye17 / EvalQAExpl
Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals.
☆17 · Apr 25, 2021 · Updated 5 years ago
shunk031 / human-attention-map-for-text-classification
Reimplementation of the paper `Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2…
☆17 · Jul 10, 2020 · Updated 6 years ago
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆35 · Jun 11, 2025 · Updated last year
THU-KEG / DICE
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
☆12 · Sep 21, 2024 · Updated last year
zouharvi / subset2evaluate
Find informative examples to efficiently (human)-evaluate NLG models.
☆17 · Apr 22, 2026 · Updated 3 months ago
mt-upc / transformer-contributions-nmt
☆18 · Oct 6, 2022 · Updated 3 years ago
hannamw / MIB-circuit-track
☆24 · Jun 30, 2025 · Updated last year
TIGER-AI-Lab / TIGERScore
"TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks" [TMLR 2024]
☆32 · Dec 21, 2024 · Updated last year
jmerullo / lm_vector_arithmetic
☆37 · May 28, 2023 · Updated 3 years ago
marcus-jw / Targeted-Manipulation-and-Deception-in-LLMs
Codebase for "On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback". This repo implements a generative multi-tur…
☆25 · Dec 3, 2024 · Updated last year
mohsenfayyaz / DecompX
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition [ACL 2023]
☆19 · Jul 3, 2025 · Updated last year
YueJiang-nj / EyeFormer-UIST2024
Code Release for the paper EyeFormer: Predicting Scanpaths in Free-Viewing Tasks with Transformer-Guided Reinforcement Learning.
☆16 · Jan 29, 2026 · Updated 5 months ago
mohsenfayyaz / GlobEnc
[NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
☆21 · May 16, 2023 · Updated 3 years ago
keing1 / reward-hack-generalization
Datasets used in the paper "Reward hacking behavior can generalize across tasks"
☆15 · Aug 17, 2025 · Updated 11 months ago
McGill-NLP / feedbackqa
FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback
☆12 · Jul 13, 2022 · Updated 4 years ago
DanielSc4 / Dynamic-Activation-Composition
Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"
☆14 · Nov 22, 2024 · Updated last year
GChrysostomou / ood_faith
☆13 · Jul 26, 2023 · Updated 3 years ago
raybears / cot-transparency
Improving transparency of large language models' reasoning
☆15 · Nov 25, 2025 · Updated 8 months ago
archiki / ReCEval
Supporting code for ReCEval paper
☆32 · Sep 14, 2024 · Updated last year
derpylz / babyplots
Babyplots is an easy to use library for creating interactive 3d graphs for exploring and presenting data.
☆27 · Updated this week
Jiaxin-Wen / MisleadLM
Official Code for our paper: "Language Models Learn to Mislead Humans via RLHF"
☆20 · Oct 11, 2024 · Updated last year
batu-el / molochs-bargain
☆15 · May 7, 2026 · Updated 2 months ago
openai / monitorability-evals
Open-sourced evaluation suite from the Monitoring Monitorability paper
☆88 · Jun 11, 2026 · Updated last month
milesaturpin / cot-unfaithfulness
☆57 · Oct 23, 2023 · Updated 2 years ago
ALT-JS / OthelloSAE
CS194-196 Course Project
☆14 · Feb 20, 2025 · Updated last year
XieJia15 / UCB-pacman-AI-projects
my solution for UC Berkeley AI projects pacman
☆11 · Jul 25, 2020 · Updated 6 years ago
anthropics / sycophancy-to-subterfuge-paper
☆28 · Sep 5, 2024 · Updated last year
allenai / few_shot_explanations
Code for NAACL 2022 paper "Reframing Human-AI Collaboration for Generating Free-Text Explanations"
☆29 · Apr 28, 2023 · Updated 3 years ago
MadryLab / AT2
Attribute statements generated by LLMs to preceding tokens using attention weights.
☆28 · Apr 22, 2025 · Updated last year
TransluceAI / circuits
ADAG: Transluce's MLP neuron-level circuit tracing library
☆34 · Apr 10, 2026 · Updated 3 months ago
jettjaniak / chainscope
Repository for the "Chain-of-Thought Reasoning In The Wild Is Not Always Faithful" paper
☆35 · Mar 31, 2026 · Updated 3 months ago
leafy-lee / E-commercial-dataset
the dataset of electronic commercial image used for saliency etc.
☆18 · Apr 7, 2024 · Updated 2 years ago