ShenzheZhu / JailDAM
[COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model
☆25 · Nov 25, 2025 · Updated 2 months ago
Alternatives and similar repositories for JailDAM
Users who are interested in JailDAM are comparing it to the libraries listed below.
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre… ☆52 · Jul 5, 2025 · Updated 7 months ago
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time" ☆29 · Jul 20, 2025 · Updated 6 months ago
- ☆16 · May 12, 2025 · Updated 9 months ago
- [ACL 2025 Findings] Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducemen… ☆25 · Jun 29, 2025 · Updated 7 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆22 · Dec 8, 2024 · Updated last year
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur… ☆85 · May 9, 2025 · Updated 9 months ago
- Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models ☆29 · Oct 6, 2025 · Updated 4 months ago
- The code for the paper "Pre-trained Vision-Language Models Learn Discoverable Concepts" ☆20 · Jun 5, 2024 · Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing" ☆20 · Apr 9, 2025 · Updated 10 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types ☆24 · Nov 29, 2024 · Updated last year
- [CVPR2025] VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding ☆24 · Mar 24, 2025 · Updated 10 months ago
- spatio-temporal tasks ☆15 · Jul 15, 2024 · Updated last year
- ☆72 · Mar 30, 2025 · Updated 10 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆13 · Jun 28, 2025 · Updated 7 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 · Feb 8, 2026 · Updated last week
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 · Apr 7, 2025 · Updated 10 months ago
- Sotopia-RL: Reward Design for Social Intelligence ☆46 · Jan 29, 2026 · Updated 2 weeks ago
- [Neurips 2025] StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models ☆24 · Dec 4, 2025 · Updated 2 months ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 · Oct 18, 2024 · Updated last year
- ☆18 · Jun 10, 2025 · Updated 8 months ago
- ☆14 · Sep 20, 2025 · Updated 4 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆38 · Jan 26, 2026 · Updated 3 weeks ago
- Official repository for K-EXAONE built by LG AI Research ☆66 · Feb 6, 2026 · Updated last week
- [ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You ☆60 · Dec 30, 2025 · Updated last month
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 4 months ago
- The official implement of paper 《DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents》 ☆28 · Oct 23, 2025 · Updated 3 months ago
- A CNN-BiLSTM model for Li-ion battery state of health and remaining useful life prediction ☆11 · Mar 25, 2024 · Updated last year
- ☆14 · Aug 7, 2025 · Updated 6 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆30 · Oct 30, 2025 · Updated 3 months ago
- Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs ☆34 · Sep 3, 2024 · Updated last year
- ☆11 · Jun 22, 2025 · Updated 7 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- ☆44 · Jun 19, 2025 · Updated 7 months ago
- ☆88 · Dec 30, 2025 · Updated last month
- Project that regroup the state-of-the-art knowledge distillation approaches for unsupervised anomaly detection ☆13 · Oct 10, 2025 · Updated 4 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated last month
- ☆12 · Mar 25, 2024 · Updated last year
- This repository contains the implementation code for the paper "Metal Surface Defect Detection Using SLF-YOLO Enhanced YOLOv8 Model." ☆19 · Feb 24, 2025 · Updated 11 months ago
- ☆12 · Oct 29, 2023 · Updated 2 years ago