dynamical-inference / patchsae
Implementation of PatchSAE as presented in "Sparse autoencoders reveal selective remapping of visual concepts during adaptation"
☆29 Updated 3 months ago
Alternatives and similar repositories for patchsae
Users who are interested in patchsae are comparing it to the libraries listed below.
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆110 Updated 2 years ago
- Localization of Knowledge in Text-to-Image Models ☆12 Updated last year
- ☆24 Updated last year
- What do we learn from inverting CLIP models? ☆58 Updated last year
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 Updated 11 months ago
- Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024. ☆56 Updated last year
- Sparse autoencoders for vision ☆55 Updated last week
- ☆79 Updated last year
- [ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation… ☆140 Updated 8 months ago
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆37 Updated 2 years ago
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆118 Updated 2 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆52 Updated last month
- ☆143 Updated last month
- Repository for PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits, accepted at CVPR 2024 XAI4CV Works… ☆19 Updated last year
- [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit… ☆30 Updated 5 months ago
- FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens ☆17 Updated 4 months ago
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors" ☆19 Updated last year
- ☆28 Updated 2 months ago
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre… ☆50 Updated 7 months ago
- [ICML 2025] No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces (official repository) ☆34 Updated 6 months ago
- Erasing conceptual knowledge from language models through low-rank fine-tuning ☆19 Updated 10 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆79 Updated last year
- ☆14 Updated last year
- [NeurIPS 2025] Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models ☆60 Updated 2 months ago
- [ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models ☆154 Updated 8 months ago
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time" ☆29 Updated 6 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆42 Updated 2 weeks ago
- Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024) ☆77 Updated last year
- ☆16 Updated 9 months ago
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment ☆26 Updated 7 months ago