Vaidehi99 / InfoDeletionAttacks
☆48 · Feb 8, 2025 · Updated last year
Alternatives and similar repositories for InfoDeletionAttacks
Users who are interested in InfoDeletionAttacks are comparing it to the repositories listed below.
- ☆60 · Mar 9, 2023 · Updated 2 years ago
- [EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents ☆16 · Sep 16, 2025 · Updated 5 months ago
- ☆17 · Nov 20, 2024 · Updated last year
- ☆15 · Apr 7, 2023 · Updated 2 years ago
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Feb 9, 2026 · Updated last week
- [NeurIPS'22] Official Repository for Characterizing Datapoints via Second-Split Forgetting ☆16 · Aug 11, 2023 · Updated 2 years ago
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper) ☆17 · Feb 10, 2025 · Updated last year
- ☆13 · Nov 8, 2022 · Updated 3 years ago
- ☆11 · Apr 4, 2023 · Updated 2 years ago
- ☆65 · Sep 29, 2024 · Updated last year
- code space of paper "Safety Layers in Aligned Large Language Models: The Key to LLM Security" (ICLR 2025) ☆21 · Apr 26, 2025 · Updated 9 months ago
- ☆17 · Nov 30, 2022 · Updated 3 years ago
- Code for the paper "Quantifying Privacy Leakage in Graph Embedding" published in MobiQuitous 2020 ☆17 · Nov 11, 2021 · Updated 4 years ago
- The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred… ☆103 · Aug 13, 2024 · Updated last year
- The code for the ACL 2023 paper "Linear Classifier: An Often-Forgotten Baseline for Text Classification". ☆19 · Jun 29, 2024 · Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated 10 months ago
- ☆44 · Apr 25, 2023 · Updated 2 years ago
- ☆47 · Jul 14, 2024 · Updated last year
- NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment ☆22 · Mar 10, 2024 · Updated last year
- Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates" ☆22 · Sep 21, 2025 · Updated 4 months ago
- For Certified Robustness to Text Adversarial Attacks by Randomized [MASK] ☆17 · Oct 8, 2024 · Updated last year
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆86 · Sep 12, 2024 · Updated last year
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆83 · Dec 21, 2024 · Updated last year
- Code for the ACL-2022 paper "Knowledge Neurons in Pretrained Transformers" ☆173 · May 4, 2024 · Updated last year
- Official PyTorch Implementation for Continual Learning and Private Unlearning ☆18 · Jul 19, 2022 · Updated 3 years ago
- ☆26 · Feb 25, 2025 · Updated 11 months ago
- OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023) ☆100 · Feb 4, 2025 · Updated last year
- ☆45 · Nov 10, 2019 · Updated 6 years ago
- An Embarrassingly Simple Backdoor Attack on Self-supervised Learning ☆20 · Jan 24, 2024 · Updated 2 years ago
- Official Implementation of ICLR 2022 paper, "Adversarial Unlearning of Backdoors via Implicit Hypergradient" ☆53 · Nov 16, 2022 · Updated 3 years ago
- Erasing concepts from neural representations with provable guarantees ☆243 · Jan 27, 2025 · Updated last year
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight) ☆27 · Nov 18, 2024 · Updated last year
- quick playground to animate pippin ☆14 · Nov 11, 2024 · Updated last year
- This repo keeps track of popular provable training and verification approaches towards robust neural networks, including leaderboards on … ☆98 · Oct 18, 2022 · Updated 3 years ago
- Code for paper "Poisoned classifiers are not only backdoored, they are fundamentally broken" ☆26 · Jan 7, 2022 · Updated 4 years ago
- The code and data for "Are Large Pre-Trained Language Models Leaking Your Personal Information?" (Findings of EMNLP '22) ☆28 · Oct 31, 2022 · Updated 3 years ago
- LLM Unlearning ☆181 · Oct 20, 2023 · Updated 2 years ago
- [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs. ☆2,716 · Feb 9, 2026 · Updated last week
- ☆59 · Jun 17, 2020 · Updated 5 years ago