hannamw / eap-ig-faithfulness
Code for "Automatic Circuit Finding and Faithfulness"
☆16 · Jul 11, 2024 · Updated last year
Alternatives and similar repositories for eap-ig-faithfulness
Users who are interested in eap-ig-faithfulness are comparing it to the repositories listed below.
- ☆71 · Jul 24, 2025 · Updated 6 months ago
- Exploring Model Kinship for Merging Large Language Models ☆27 · Apr 16, 2025 · Updated 10 months ago
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023). ☆26 · Aug 25, 2024 · Updated last year
- ☆28 · May 4, 2023 · Updated 2 years ago
- ☆32 · Jan 13, 2025 · Updated last year
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆163 · Nov 14, 2025 · Updated 3 months ago
- ☆10 · Oct 2, 2024 · Updated last year
- https://icml.cc/virtual/2023/poster/24354 ☆10 · Aug 15, 2023 · Updated 2 years ago
- A Unifying Principle from Fundamental Particles to Digital Being ☆27 · Feb 5, 2026 · Updated last week
- ☆12 · Feb 23, 2022 · Updated 3 years ago
- Repository containing the scripts regarding analyses in Zufferey & Tavernari et al. ☆10 · Feb 22, 2021 · Updated 4 years ago
- Reading comprehension based question-answering model for news articles. ☆11 · Jun 22, 2022 · Updated 3 years ago
- ☆269 · Oct 1, 2024 · Updated last year
- Novel character relationship analytics system ☆11 · Apr 16, 2017 · Updated 8 years ago
- Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models" ☆13 · Jun 1, 2024 · Updated last year
- Pre-trained Online Contrastive Learning for Insurance Fraud Detection ☆12 · Jul 12, 2024 · Updated last year
- TYPO3 Extension ⇢ Integration of sendinblue as finisher of the form extension ☆11 · Jan 23, 2025 · Updated last year
- Code of our paper "Method-Level Bug Severity Prediction using Source Code Metrics and LLMs" which is accepted to ISSRE 2023. ☆10 · Nov 12, 2023 · Updated 2 years ago
- MID (Mutual Information Dimension) for measuring statistical dependence between two random variables ☆12 · Apr 21, 2013 · Updated 12 years ago
- [NeurIPS 2022] Explaining Graph Neural Networks with Structure-Aware Cooperative Games (GStarX) ☆14 · Oct 20, 2022 · Updated 3 years ago
- [EMNLP 2025] Circuit-Aware Editing Enables Generalizable Knowledge Learners ☆18 · Nov 17, 2025 · Updated 3 months ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆57 · Oct 30, 2025 · Updated 3 months ago
- The code for paper Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models. ☆13 · Apr 10, 2024 · Updated last year
- ☆10 · Jun 5, 2021 · Updated 4 years ago
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆19 · Sep 18, 2025 · Updated 5 months ago
- A list of Numerical Multimodal reasoning papers and their implementation ☆11 · May 13, 2024 · Updated last year
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis ☆12 · Nov 17, 2024 · Updated last year
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M… ☆13 · Nov 17, 2024 · Updated last year
- Code for Mateen: Adaptive Ensemble Learning for Network Anomaly Detection ☆16 · Feb 27, 2025 · Updated 11 months ago
- ☆13 · Oct 9, 2024 · Updated last year
- ☆14 · May 30, 2024 · Updated last year
- ☆10 · Feb 3, 2025 · Updated last year
- Explore/examine/explain/expose your model with the explabox! ☆19 · Oct 14, 2025 · Updated 4 months ago
- ☆17 · Jun 18, 2025 · Updated 8 months ago
- ☆21 · Jun 22, 2025 · Updated 7 months ago
- ☆20 · Nov 15, 2024 · Updated last year
- A Pytorch implementation of the Mutual Information Neural Estimator ☆12 · Jan 2, 2024 · Updated 2 years ago
- [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP ☆13 · Aug 17, 2023 · Updated 2 years ago
- Webpack config for Angular JS 1.7 ☆11 · Dec 8, 2022 · Updated 3 years ago