ChengshuaiZhao0 / The-Wolf-Within
⭐10 · Updated last month
Related projects
Alternatives and complementary repositories for The-Wolf-Within
- ⭐26 · Updated 4 months ago
- [ICLR 2024 Spotlight 🔥] [Best Paper Award SoCal NLP 2023 🏆] Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ⭐26 · Updated 5 months ago
- [ICLR 2023] Official repository of the paper "Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning" ⭐17 · Updated last year
- [ECCV 2022] "Adversarial Contrastive Learning via Asymmetric InfoNCE" ⭐22 · Updated last year
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ⭐67 · Updated 11 months ago
- [ICLR 2024 Oral] Less is More: Fewer Interpretable Region via Submodular Subset Selection ⭐72 · Updated last month
- ⭐30 · Updated 5 months ago
- AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models ⭐44 · Updated 7 months ago
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ⭐20 · Updated last month
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models ⭐37 · Updated last week
- ⭐19 · Updated 3 weeks ago
- Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Model… ⭐26 · Updated 2 weeks ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance" ⭐31 · Updated 7 months ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794… ⭐12 · Updated 3 months ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ⭐28 · Updated last month
- Official code for "TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization", CVPR 2023 ⭐13 · Updated last year
- Official PyTorch implementation of "CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning" @ ICCV 2023 ⭐29 · Updated 10 months ago
- ⭐38 · Updated last year
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. ⭐45 · Updated 3 months ago
- ⭐57 · Updated last month
- Evaluate robustness of adaptation methods on large vision-language models ⭐17 · Updated last year
- The official implementation of ECCV'24 paper "To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Uns… ⭐58 · Updated 2 weeks ago
- ⭐12 · Updated 6 months ago
- [ECCV-2024] Transferable Targeted Adversarial Attack, CLIP models, Generative adversarial network, Multi-target attacks ⭐22 · Updated 3 months ago
- ⭐16 · Updated 6 months ago
- Implementation of BadCLIP https://arxiv.org/pdf/2311.16194.pdf ⭐17 · Updated 7 months ago
- [ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio… ⭐16 · Updated 5 months ago
- [NeurIPS-2023] Annual Conference on Neural Information Processing Systems ⭐162 · Updated last year
- ⭐15 · Updated last week
- ⭐24 · Updated 5 months ago