ShenzheZhu / JailDAM
JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model
☆13Updated last month
Alternatives and similar repositories for JailDAM
Users who are interested in JailDAM are comparing it to the libraries listed below:
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance"☆37Updated last year
- EMPO, A Fully Unsupervised RLVR Method☆40Updated 2 weeks ago
- A Task of Fictitious Unlearning for VLMs☆19Updated 2 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…☆44Updated 6 months ago
- ☆24Updated 2 months ago
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"☆81Updated last year
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?☆31Updated 7 months ago
- ECSO (Make MLLM safe with neither training nor any external models!) (https://arxiv.org/abs/2403.09572)☆25Updated 7 months ago
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time"☆23Updated this week
- [ICML 2024] "Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection"☆12Updated 4 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆13Updated last year
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.☆71Updated 4 months ago
- ☆27Updated 2 months ago
- ☆11Updated 4 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…☆59Updated 11 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation"☆18Updated last month
- ☆18Updated last year
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.☆73Updated 5 months ago
- ☆20Updated last month
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"☆34Updated 5 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety"☆18Updated this week
- [ICLR 2024] Official code base for "Long-Tailed Diffusion Models With Oriented Calibration"☆12Updated 11 months ago
- ☆26Updated last year
- ☆57Updated 7 months ago
- ☆27Updated last year
- [FCS'24] LVLM Safety paper☆18Updated 5 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task☆34Updated 2 months ago
- AutoHallusion Codebase (EMNLP 2024)☆19Updated 6 months ago
- LCA-on-the-line (ICML 2024 Oral)☆12Updated 4 months ago
- ☆19Updated 9 months ago