☆73 · May 7, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for causalab
Users that are interested in causalab are comparing it to the libraries listed below.
- ☆24 · Jun 30, 2025 · Updated 10 months ago
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆25 · Aug 15, 2025 · Updated 9 months ago
- Multi-Layer Sparse Autoencoders (ICLR 2025) ☆30 · Feb 6, 2026 · Updated 3 months ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆58 · Oct 30, 2025 · Updated 6 months ago
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors" ☆21 · Dec 14, 2024 · Updated last year
- ☆19 · Aug 19, 2025 · Updated 9 months ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- ☆25 · Jun 13, 2024 · Updated last year
- ☆28 · Nov 28, 2024 · Updated last year
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆66 · Oct 27, 2024 · Updated last year
- Applying SAEs for fine-grained control ☆27 · Dec 15, 2024 · Updated last year
- [ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions ☆14 · Mar 7, 2026 · Updated 2 months ago
- ☆107 · Aug 8, 2024 · Updated last year
- A library for efficient patching and automatic circuit discovery. ☆97 · Dec 31, 2025 · Updated 4 months ago
- Generative tree visualiser for Python ☆16 · Sep 15, 2020 · Updated 5 years ago
- A Mechanistic Interpretability Analysis of Grokking ☆27 · Sep 26, 2022 · Updated 3 years ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆53 · Nov 30, 2024 · Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆25 · Feb 16, 2026 · Updated 3 months ago
- YesBut - Multimodal Satire Comprehension Dataset ☆19 · Oct 23, 2024 · Updated last year
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆20 · Jan 19, 2025 · Updated last year
- Kim, J., Evans, J., & Schein, A. (2025). Linear Representations of Political Perspective Emerge in Large Language Models. ICLR. ☆25 · Mar 27, 2025 · Updated last year
- Calling disease-related genes ☆16 · Apr 1, 2026 · Updated last month
- d3heatmap is a Python package to create interactive heatmaps based on d3js. ☆11 · Sep 14, 2023 · Updated 2 years ago
- ☆33 · Nov 16, 2025 · Updated 6 months ago
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction". ☆390 · Jun 13, 2025 · Updated 11 months ago
- the handbook for nilenso ☆14 · Feb 6, 2024 · Updated 2 years ago
- Xie's R Archive Network (experimental and for my personal interest only) ☆26 · Sep 6, 2021 · Updated 4 years ago
- The Internet Memes Knowledge Graph ☆18 · Oct 18, 2024 · Updated last year
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23) ☆18 · Jun 24, 2024 · Updated last year
- ☆19 · Apr 4, 2025 · Updated last year
- ☆18 · Jul 3, 2024 · Updated last year
- ☆19 · Apr 10, 2025 · Updated last year
- Python evaluation scripts for AIDA-formatted CoNLL data ☆20 · Aug 4, 2014 · Updated 11 years ago
- PyTorch implementation of Swap-VAE: A self-supervised approach for generating neural activity ☆13 · Nov 17, 2021 · Updated 4 years ago
- Create string diagrams with LaTeX! ☆14 · Jan 3, 2025 · Updated last year
- Algebraic value editing in pretrained language models ☆70 · Nov 1, 2023 · Updated 2 years ago
- ☆13 · Mar 12, 2024 · Updated 2 years ago
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆43 · Feb 12, 2025 · Updated last year
- ☆26 · Sep 3, 2025 · Updated 8 months ago