[COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model
☆25Nov 25, 2025Updated 3 months ago
Alternatives and similar repositories for JailDAM
Users who are interested in JailDAM are comparing it to the repositories listed below
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre…☆53Jul 5, 2025Updated 8 months ago
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time"☆30Jul 20, 2025Updated 7 months ago
- ☆16May 12, 2025Updated 9 months ago
- [ACL 2025 Findings] Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducemen…☆27Jun 29, 2025Updated 8 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆23Dec 8, 2024Updated last year
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…☆88May 9, 2025Updated 10 months ago
- The code for the paper "Pre-trained Vision-Language Models Learn Discoverable Concepts"☆21Jun 5, 2024Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 11 months ago
- Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models☆30Oct 6, 2025Updated 5 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- [CVPR 2025] VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding☆24Mar 24, 2025Updated 11 months ago
- spatio-temporal tasks☆16Jul 15, 2024Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Updated this week
- ☆73Mar 30, 2025Updated 11 months ago
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data☆33Apr 7, 2025Updated 11 months ago
- ☆18Jun 10, 2025Updated 8 months ago
- Sotopia-RL: Reward Design for Social Intelligence☆46Jan 29, 2026Updated last month
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- ☆14Sep 20, 2025Updated 5 months ago
- [ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You☆61Dec 30, 2025Updated 2 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆42Updated this week
- ☆11Jun 22, 2025Updated 8 months ago
- ☆23Feb 10, 2026Updated 3 weeks ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆30Oct 2, 2025Updated 5 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- Official repository for K-EXAONE built by LG AI Research☆69Feb 6, 2026Updated last month
- ☆14Aug 7, 2025Updated 7 months ago
- [NeurIPS 2025] StegoZip: Enhancing Linguistic Steganography Payload in Practice with Large Language Models☆26Dec 4, 2025Updated 3 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆29Oct 23, 2025Updated 4 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆42Feb 27, 2026Updated last week
- Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs☆34Sep 3, 2024Updated last year
- A CNN-BiLSTM model for Li-ion battery state of health and remaining useful life prediction☆11Mar 25, 2024Updated last year
- ☆44Jun 19, 2025Updated 8 months ago
- ☆93Dec 30, 2025Updated 2 months ago
- GenoCraft: A Comprehensive, User-Friendly Web Platform for High-Throughput Omics Data Analysis and Visualization (https://arxiv.org/pdf/2…☆19May 28, 2025Updated 9 months ago
- ☆16Sep 17, 2024Updated last year
- An unsupervised learning and detection method for fabric defects based on convolutional autoencoders and image pyramids☆10Jun 28, 2023Updated 2 years ago