dynamical-inference / patchsae
Implementation of PatchSAE as presented in "Sparse autoencoders reveal selective remapping of visual concepts during adaptation"
☆28 Updated 3 weeks ago
Alternatives and similar repositories for patchsae
Users who are interested in patchsae are comparing it to the repositories listed below
- Sparse autoencoders for vision ☆50 Updated this week
- ☆23 Updated 10 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆107 Updated 2 years ago
- Localization of Knowledge in Text-to-Image Models ☆11 Updated last year
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆35 Updated last year
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors" ☆17 Updated 11 months ago
- What do we learn from inverting CLIP models? ☆56 Updated last year
- ☆22 Updated last week
- A curated list of Awesome Personalized Large Multimodal Models resources ☆47 Updated 2 months ago
- ☆25 Updated 4 months ago
- Official repository of "Localizing Task Information for Improved Model Merging and Compression" [ICML 2024] ☆51 Updated last year
- ☆73 Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆40 Updated last year
- Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024) ☆77 Updated last year
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆76 Updated last year
- [ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models ☆148 Updated 5 months ago
- [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit… ☆28 Updated 3 months ago
- ☆24 Updated 5 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models. ☆11 Updated 11 months ago
- [ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation… ☆138 Updated 6 months ago
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆29 Updated last year
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆41 Updated last year
- Erasing conceptual knowledge from language models through low-rank fine-tuning ☆19 Updated 8 months ago
- Spurious Features Everywhere - Large-Scale Detection of Harmful Spurious Features in ImageNet ☆32 Updated 2 years ago
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆72 Updated 8 months ago
- Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024. ☆51 Updated last year
- [ICLR 23 spotlight] An automatic and efficient tool to describe functionalities of individual neurons in DNNs ☆56 Updated 2 years ago
- ☆16 Updated 6 months ago
- Official pytorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆40 Updated last year
- ☆110 Updated 9 months ago