rushil-thareja / dp-fusion-lib
A Python package for text sanitization with differential privacy
☆24 Updated 2 weeks ago
Alternatives and similar repositories for dp-fusion-lib
Users who are interested in dp-fusion-lib are comparing it to the libraries listed below.
- ☆83 Updated 10 months ago
- ☆383 Updated 4 months ago
- ☆140 Updated last week
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆57 Updated 2 months ago
- ☆227 Updated last year
- ☆68 Updated 10 months ago
- ☆57 Updated 11 months ago
- ☆58 Updated last year
- ☆90 Updated 9 months ago
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆331 Updated 5 months ago
- ☆14 Updated last year
- PyTorch library for Active Fine-Tuning ☆96 Updated 3 months ago
- AI Logging for Interpretability and Explainability🔬 ☆138 Updated last year
- ☆193 Updated last year
- ☆80 Updated 3 years ago
- Simple and scalable tools for data-driven pretraining data selection. ☆29 Updated 7 months ago
- [ICLR 25] A novel framework for building intrinsically interpretable LLMs with human-understandable concepts to ensure safety, reliabilit… ☆28 Updated 4 months ago
- ☆112 Updated 11 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆63 Updated last year
- ☆23 Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆142 Updated 11 months ago
- Using sparse coding to find distributed representations used by neural networks. ☆290 Updated 2 years ago
- LLM-Merging: Building LLMs Efficiently through Merging ☆208 Updated last year
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery" ☆44 Updated last year
- 👋 Overcomplete is a Vision-based SAE Toolbox ☆112 Updated last month
- ☆132 Updated 2 years ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆74 Updated 6 months ago
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆108 Updated 2 years ago
- A toolkit for quantitative evaluation of data attribution methods. ☆54 Updated 5 months ago
- ☆32 Updated 11 months ago