☆22 · Oct 25, 2024 · Updated last year
Alternatives and similar repositories for SafeBench
Users that are interested in SafeBench are comparing it to the libraries listed below.
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time" ☆31 · Jul 20, 2025 · Updated 10 months ago
- ☆69 · Jun 1, 2025 · Updated 11 months ago
- ☆27 · Mar 17, 2025 · Updated last year
- ECSO (Make MLLM safe without neither training nor any external models!) (https://arxiv.org/abs/2403.09572) ☆36 · Nov 2, 2024 · Updated last year
- The official repository for guided jailbreak benchmark ☆29 · Jul 28, 2025 · Updated 9 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆22 · Dec 8, 2024 · Updated last year
- [ICLR 2025] BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks ☆31 · Nov 2, 2025 · Updated 6 months ago
- ☆45 · Jun 19, 2025 · Updated 11 months ago
- ☆27 · Jun 5, 2024 · Updated last year
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ☆40 · Oct 17, 2024 · Updated last year
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre… ☆60 · Jul 5, 2025 · Updated 10 months ago
- [AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts ☆204 · Jun 26, 2025 · Updated 10 months ago
- Code for the ICCV 2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves" ☆17 · Jul 11, 2025 · Updated 10 months ago
- [NDSS 2026] Official repo for Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography ☆35 · Mar 14, 2026 · Updated 2 months ago
- [ICLR 2025] Official codebase for the ICLR 2025 paper "Multimodal Situational Safety" ☆35 · Jun 23, 2025 · Updated 10 months ago
- Prompt Generator model for Stable Diffusion Models ☆12 · Jun 20, 2023 · Updated 2 years ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆46 · Apr 21, 2024 · Updated 2 years ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24] ☆10 · Jul 22, 2024 · Updated last year
- Improved techniques for optimization-based jailbreaking on large language models (ICLR2025) ☆146 · Apr 7, 2025 · Updated last year
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆90 · Mar 30, 2025 · Updated last year
- ☆18 · Jun 4, 2025 · Updated 11 months ago
- Teaching a Convolutional Neural Network to recognize painting genre. Handcrafted dataset. Cool visualizations. ☆10 · Dec 19, 2018 · Updated 7 years ago
- [TOIS'24] "RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation" ☆16 · Dec 1, 2024 · Updated last year
- Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion ☆11 · Apr 1, 2024 · Updated 2 years ago
- ☆60 · Jun 5, 2024 · Updated last year
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch ☆15 · Jun 21, 2023 · Updated 2 years ago
- A list of research towards security&privacy in AI-Generated Content ☆17 · Jan 10, 2025 · Updated last year
- Adversarial Item Promotion in visually-aware recommenders ☆17 · Sep 3, 2021 · Updated 4 years ago
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models" ☆18 · Mar 31, 2025 · Updated last year
- The first toolkit for MLRM safety evaluation, providing unified interface for mainstream models, datasets, and jailbreaking methods! ☆15 · Apr 8, 2025 · Updated last year
- Finite Element Analysis for Tactile Sensing (FEATS) ☆27 · Oct 1, 2025 · Updated 7 months ago
- Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning ☆22 · Jul 8, 2024 · Updated last year
- This repo contains the code for studying the interplay between quantization and sparsity methods ☆26 · Feb 26, 2025 · Updated last year
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ☆87 · Nov 28, 2023 · Updated 2 years ago
- Code for Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks (TIFS2024) ☆13 · Mar 29, 2024 · Updated 2 years ago
- A Behance web crawler project ☆10 · May 16, 2019 · Updated 7 years ago
- Files related to a sample video created to help .Net developers to publish a simple .Net web app to Linux server ☆21 · Aug 10, 2023 · Updated 2 years ago
- [KDD'21] Official PyTorch implementation for "Data Poisoning Attack against Recommender System Using Incomplete and Perturbed Data". ☆13 · Sep 19, 2021 · Updated 4 years ago