jzhang538 / BadMergingLinks

[CCS 2024] "BadMerging: Backdoor Attacks Against Model Merging": official code implementation.

☆28

Alternatives and similar repositories for BadMerging

Users that are interested in BadMerging are comparing it to the libraries listed below

Sorting:

chs20 / RobustVLM
[ICML 2024] Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
☆135Updated last month
VILA-Lab / M-Attack
A Simple Baseline Achieving Over 90% Success Rate Against the Strong Black-box Models of GPT-4.5/4o/o1. Paper at: https://arxiv.org/abs/2…
☆65Updated 3 months ago
sail-sg / MMCBench
☆27Updated last year
VILA-Lab / DELT
(CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…
☆23Updated 2 months ago
sail-sg / AnyDoor
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
☆55Updated last year
OPTML-Group / Unlearn-Saliency
[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation…
☆126Updated last month
sail-sg / Rigging-ChatbotArena
Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)
☆21Updated 4 months ago
UCSC-VLAA / FedConv
[TMLR'24] This repository includes the official implementation our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D…
☆25Updated last year
sail-sg / Agent-Smith
[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
☆107Updated last year
ShiJiawenwen / JudgeDeceiver
[CCS 2024] Optimization-based Prompt Injection Attack to LLM-as-a-Judge
☆25Updated 8 months ago
umd-huang-lab / VLM-Poisoning
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆50Updated 6 months ago
ChenWu98 / agent-attack
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
☆97Updated 4 months ago
sail-sg / Meta-Unlearning
☆27Updated 2 months ago
ThuCCSLab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆15Updated 9 months ago
KuofengGao / Verbose_Images
[ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
☆36Updated last year
OPTML-Group / Diffusion-MU-Attack
The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns…
☆78Updated 4 months ago
kairanzhao / RUM
[NeurIPS24] "What makes unlearning hard and what to do about it" [NeurIPS24] "Scalability of memorization-based machine unlearning"
☆17Updated last month
UCDvision / backdoor_transformer
☆31Updated 2 years ago
1xbq1 / FedMLLM
☆27Updated 4 months ago
franciscoliu / Awesome-GenAI-Unlearning
☆153Updated 3 months ago
kstanghere / GenderCARE-ccs24
This repository contains the source code, datasets, and scripts for the paper "GenderCARE: A Comprehensive Framework for Assessing and Re…
☆23Updated 10 months ago
THU-BPM / Watermarked_LLM_Identification
Code and data for paper "Can Watermarked LLMs be Identified by Users via Crafted Prompts?" Accepted by ICLR 2025 (Spotlight)
☆23Updated 6 months ago
OngWinKent / Federated-Feature-Unlearning
[NeurIPS 2024] Official implementation of the paper “Ferrari: Federated Feature Unlearning via Optimizing Feature Sensitivity"
☆18Updated 4 months ago
ShawnXYang / C-GSP
☆14Updated 2 years ago
RobustNLP / DeRTa
A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state.
☆61Updated last month
AtsuMiyai / UPD
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
☆77Updated last month
AoiDragon / HADES
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆29Updated 9 months ago
AI45Lab / REEF
The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…
☆57Updated 6 months ago
thu-ml / MMTrustEval
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
☆154Updated 3 weeks ago
salman-lui / x-teaming
☆29Updated last month