spencerwooo / torchattack
A curated list of adversarial attacks in PyTorch, with a focus on transferable black-box attacks.
⭐52 · Updated last week
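To illustrate the kind of attack torchattack focuses on, here is a minimal sketch of the classic FGSM (Fast Gradient Sign Method) step in plain PyTorch. This is an illustrative sketch only, not torchattack's actual API; `model`, `images`, and `labels` are assumed inputs (a classifier, a batch of images in [0, 1], and integer class labels).

```python
import torch
import torch.nn as nn

def fgsm(model: nn.Module, images: torch.Tensor, labels: torch.Tensor,
         eps: float = 8 / 255) -> torch.Tensor:
    """One-step FGSM: perturb inputs along the sign of the loss gradient.

    Illustrative sketch only -- not torchattack's API. `model`, `images`,
    and `labels` are assumed inputs.
    """
    images = images.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    # Step in the direction that increases the loss, then clamp to the
    # valid pixel range.
    adv = images + eps * grad.sign()
    return adv.clamp(0, 1).detach()
```

In a transfer-based black-box setting, `model` would be a white-box surrogate; the returned adversarial batch is then evaluated against an unseen target model.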
Alternatives and similar repositories for torchattack:
Users interested in torchattack are comparing it to the libraries listed below:
- ⭐12 · Updated last year
- Convert TensorFlow models to PyTorch models via [MMdnn](https://github.com/microsoft/MMdnn) for adversarial attacks. ⭐84 · Updated 2 years ago
- An up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources. ⭐226 · Updated this week
- ⭐79 · Updated 3 years ago
- [USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a… ⭐68 · Updated 4 months ago
- This is the official implementation of our paper 'Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protecti… ⭐53 · Updated 11 months ago
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024) ⭐34 · Updated 11 months ago
- This is the official repository of "VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models" (NeurIPS 2… ⭐46 · Updated 4 months ago
- [USENIX'24] Prompt Stealing Attacks Against Text-to-Image Generation Models ⭐33 · Updated last month
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ⭐17 · Updated 4 months ago
- Code for the ACM MM 2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models ⭐22 · Updated 2 months ago
- ⭐24 · Updated 7 months ago
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images ⭐28 · Updated last year
- [NDSS 2025] Official code for our paper "Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Wate… ⭐30 · Updated 3 months ago
- Repository for the paper (AAAI 2024, Oral): Visual Adversarial Examples Jailbreak Large Language Models ⭐203 · Updated 9 months ago
- A list of papers in NeurIPS 2022 related to adversarial attack and defense / AI security. ⭐71 · Updated 2 years ago
- ⭐92 · Updated last year
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Datasets and Benchmarks Track) ⭐131 · Updated last week
- Paper list of Adversarial Examples ⭐46 · Updated last year
- ⭐43 · Updated last year
- [AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts ⭐113 · Updated last week
- ⭐16 · Updated last month
- ⭐41 · Updated 6 months ago
- [NeurIPS 2023] Official Code Repo: Diffusion-Based Adversarial Sample Generation for Improved Stealthiness and Controllability ⭐99 · Updated last year
- ⭐41 · Updated 2 months ago
- ⭐25 · Updated 5 months ago
- Official Implementation for "Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models" (IE… ⭐13 · Updated 2 months ago