jzhang538 / BadMerging
[CCS 2024] "BadMerging: Backdoor Attacks Against Model Merging": official code implementation.
☆26Updated 8 months ago
Alternatives and similar repositories for BadMerging
Users who are interested in BadMerging compare it to the repositories listed below
- A Simple Baseline Achieving Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1. Paper at: https://arxiv.org/abs/2…☆57Updated last month
- (CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆21Updated 2 weeks ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)☆20Updated 2 months ago
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images☆33Updated last year
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents☆84Updated 2 months ago
- This repository contains the source code, datasets, and scripts for the paper "GenderCARE: A Comprehensive Framework for Assessing and Re…☆21Updated 8 months ago
- Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing☆37Updated 5 months ago
- [ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models☆130Updated 6 months ago
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …☆29Updated 7 months ago
- The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns…☆75Updated 2 months ago
- [ICLR 2025] VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking (Official Implementation)☆35Updated last month
- Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"☆46Updated 4 months ago
- List of T2I safety papers, updated daily, welcome to discuss using Discussions☆60Updated 9 months ago
- This is a collection of awesome papers I have read (carefully or roughly) in the fields of security in diffusion models. Any suggestions …☆27Updated 6 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…☆57Updated 10 months ago
- [TMLR'24] This repository includes the official implementation of our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D…☆25Updated last year
- This is the official repo of the paper "Latent Guard: a Safety Framework for Text-to-image Generation"☆48Updated 6 months ago
- Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Model…☆40Updated 6 months ago
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment☆15Updated 2 months ago
- Official Code for ACL 2024 paper "GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis"☆56Updated 6 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models☆76Updated 8 months ago
- [CVPR 2024] Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers☆17Updated 6 months ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state.☆59Updated 3 months ago
- Official implementation of "Towards Robust Model Watermark via Reducing Parametric Vulnerability"☆14Updated 11 months ago
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…☆52Updated 11 months ago