Mapping out the "memory" of neural nets with data attribution
☆49Mar 19, 2026Updated this week
Alternatives and similar repositories for bergson
Users who are interested in bergson are comparing it to the libraries listed below
- Attribution-based Parameter Decomposition☆34Jun 11, 2025Updated 9 months ago
- ☆158Dec 30, 2025Updated 2 months ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La…☆27Nov 3, 2025Updated 4 months ago
- Efficiently computing & storing token n-grams from large corpora☆27Oct 6, 2024Updated last year
- Engine for collecting, uploading, and downloading model activations☆26Apr 2, 2025Updated 11 months ago
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- Repository for PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits, accepted at CVPR 2024 XAI4CV Works…☆20May 29, 2024Updated last year
- ☆32Sep 28, 2025Updated 5 months ago
- explainable Siamese sentence transformers☆13Mar 26, 2024Updated last year
- A repo for generating random NFTs with metadata 100% on chain!☆37Mar 8, 2024Updated 2 years ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)☆34Feb 27, 2025Updated last year
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps"☆25Jul 12, 2024Updated last year
- Landing page for MIB: A Mechanistic Interpretability Benchmark☆24Aug 15, 2025Updated 7 months ago
- Hugging Face Jobs☆19Jul 11, 2025Updated 8 months ago
- ☆19Feb 8, 2024Updated 2 years ago
- ☆16May 1, 2025Updated 10 months ago
- A toy Inspect implementation of the Bliss Attractor eval from Claude 4 System Card Welfare Assessment☆38Jun 5, 2025Updated 9 months ago
- In the interest of transparency and/or their great value to society, I'm releasing my smaller projects/scripts upon an unsuspecting publi…☆10Jan 30, 2026Updated last month
- 🚀💼☆18Feb 13, 2026Updated last month
- Benchmark to estimate model sycophancy☆23Nov 30, 2025Updated 3 months ago
- ☆12Jan 9, 2024Updated 2 years ago
- ☆30Mar 13, 2026Updated last week
- [SIGMOD2026] Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search with Task-Centric Benchmarks☆24Dec 31, 2025Updated 2 months ago
- Simple SMTP server for teaching Rust.☆13Jan 20, 2021Updated 5 years ago
- ☆23Jan 27, 2025Updated last year
- GitHub repository for DORA: Data-agnOstic Representation Analysis paper. DORA allows to find outlier representations in Deep Neural Netwo…☆27Mar 19, 2023Updated 3 years ago
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind)☆20Jan 19, 2025Updated last year
- GoldFinch and other hybrid transformer components☆12Dec 9, 2025Updated 3 months ago
- ☆30Jan 12, 2026Updated 2 months ago
- My private config for Doom Emacs☆11Sep 30, 2022Updated 3 years ago
- Explainability of Deep RL algorithms using graph networks and layer-wise relevance propagation.☆11Aug 20, 2024Updated last year
- ☆24Jan 28, 2025Updated last year
- H-Net Dynamic Hierarchical Architecture☆81Sep 11, 2025Updated 6 months ago
- A tool for model sparsification based on torch.fx☆13Jun 3, 2024Updated last year
- ☆48Feb 23, 2025Updated last year
- Open source replication of Anthropic's Crosscoders for Model Diffing☆63Oct 27, 2024Updated last year
- ☆13Oct 5, 2025Updated 5 months ago
- All-in-One Safety Evaluation Framework☆46Mar 4, 2026Updated 2 weeks ago
- ☆87Updated this week