Vaidehi99/InfoDeletionAttacks

☆49

Alternatives and similar repositories for InfoDeletionAttacks

Users who are interested in InfoDeletionAttacks are comparing it to the libraries listed below.

ejones313 / auditing-llms
☆61 · Mar 9, 2023 · Updated 3 years ago
ZetangForward / CMD-Context-aware-Model-self-Detoxification
CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper)
☆17 · Feb 10, 2025 · Updated last year
starrYYxuan / UniTE
☆17 · Nov 20, 2024 · Updated last year
clear-nus / selective-amnesia
☆65 · Sep 29, 2024 · Updated last year
NYU-DICE-Lab / circumventing-concept-erasure
☆23 · Feb 5, 2026 · Updated 5 months ago
pratyushmaini / ssft
[NeurIPS'22] Official Repository for Characterizing Datapoints via Second-Split Forgetting
☆16 · Aug 11, 2023 · Updated 2 years ago
Cranial-XIX / Continual-Learning-Private-Unlearning
Official PyTorch Implementation for Continual Learning and Private Unlearning
☆19 · Jul 19, 2022 · Updated 4 years ago
ZJU-LLM-Safety / HarmMetric_Eval
☆15 · Apr 1, 2026 · Updated 3 months ago
lchen001 / HAPI
☆16 · Nov 30, 2022 · Updated 3 years ago
ZJU-LLM-Safety / MAJIC-AAAI2026
[AAAI-2026] MAJIC: Markovian Adaptive Jailbreaking. An automated black-box attack framework against LLMs that iteratively selects and fuse…
☆16 · Updated this week
ZJU-LLM-Safety / BGPShield
A prototype for innovative BGP anomaly detection system --- BGPShield.
☆18 · Apr 7, 2026 · Updated 3 months ago
Thartvigsen / GRACE
[NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors
☆86 · Dec 21, 2024 · Updated last year
DandanGuo1993 / reweight-imbalance-classification-with-OT
☆13 · Nov 8, 2022 · Updated 3 years ago
irenasaracay / model-equality-testing
Test equality between a black-box LLM API and a reference distribution
☆20 · Oct 29, 2024 · Updated last year
Dicer-Zz / EPI
Code for the paper: Rehearsal-free Continual Language Learning via Efficient Parameter Isolation
☆13 · May 16, 2023 · Updated 3 years ago
google-research / lm-extraction-benchmark
☆307 · Jun 10, 2026 · Updated last month
agentward-ai / agentward
Open-source permission control plane for AI agents. Scan, enforce, and audit every tool call.
☆19 · Jul 8, 2026 · Updated 2 weeks ago
Greysahy / ipiguard
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆22 · Sep 16, 2025 · Updated 10 months ago
TsinghuaC3I / FS-GEN
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.
☆13 · Nov 19, 2024 · Updated last year
microsoft / analysing_pii_leakage
The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred…
☆105 · Aug 13, 2024 · Updated last year
zjiehang / RanMASK
For Certified Robustness to Text Adversarial Attacks by Randomized [MASK]
☆17 · Oct 8, 2024 · Updated last year
mjy1111 / BAKE
This is the repository for our paper: Untying the Reversal Curse via Bidirectional Language Model Editing
☆11 · May 25, 2025 · Updated last year
vfleaking / PTST
Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates"
☆22 · Sep 21, 2025 · Updated 10 months ago
YuanGongND / realtime-adversarial-attack
Code for IJCAI 2019 paper "Real-time Adversarial Attack".
☆20 · Jul 4, 2020 · Updated 6 years ago
jaleedkhan / neusire
NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment
☆24 · Mar 10, 2024 · Updated 2 years ago
matchten / LoRA-Models-for-SAEs
Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"
☆17 · Mar 31, 2025 · Updated last year
zjunlp / EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
☆2,886 · Jul 14, 2026 · Updated 2 weeks ago
TrustAIResearch / MLHospital
☆44 · Apr 25, 2023 · Updated 3 years ago
YitingQu / unsafe-diffusion
☆50 · Jul 14, 2024 · Updated 2 years ago
carlwharris / cog-bias-med-LLMs
Addressing common clinical biases in medical language models
☆17 · Jul 27, 2024 · Updated 2 years ago
val-iisc / Hard-Label-Model-Stealing
☆34 · Mar 28, 2022 · Updated 4 years ago
Arvid-pku / ATOKE
[AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model
☆13 · Dec 17, 2023 · Updated 2 years ago
AdityaGolatkar / SelectiveForgetting
☆62 · Jun 17, 2020 · Updated 6 years ago
opendataval / opendataval
OpenDataVal: a Unified Benchmark for Data Valuation in Python (NeurIPS 2023)
☆101 · Feb 4, 2025 · Updated last year
thu-coai / SafeUnlearning
Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
☆32 · Jul 9, 2024 · Updated 2 years ago
alessiomora / unlearning_fl
☆17 · Feb 17, 2024 · Updated 2 years ago
chrisliu298 / gpt2-arxiv
Fine-tuning GPT-2 to generate research paper abstracts
☆12 · Apr 28, 2021 · Updated 5 years ago
CreaLabs / Enhanced-BGE-M3-with-CLP-and-MoE
This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…
☆11 · Dec 27, 2024 · Updated last year
OPTML-Group / Diffusion-MU-Attack
The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns…
☆89 · Feb 28, 2025 · Updated last year