☆68Sep 30, 2025Updated 5 months ago
Alternatives and similar repositories for LlavaGuard
Users that are interested in LlavaGuard are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models"☆16Mar 31, 2025Updated 11 months ago
- ☆35May 22, 2024Updated last year
- Divide-and-Conquer Attack: Harnessing the Power of LLM to Bypass the Censorship of Text-to-Image Generation Mode☆18Feb 16, 2025Updated last year
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety☆57Jul 21, 2025Updated 8 months ago
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre…☆55Jul 5, 2025Updated 8 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Fine-tuning-free Shapley value (FreeShap) for instance attribution☆14May 29, 2024Updated last year
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…☆88May 9, 2025Updated 10 months ago
- Evaluating Durability: Benchmark Insights into Multimodal Watermarking☆12Jun 7, 2024Updated last year
- ☆23May 28, 2025Updated 9 months ago
- [COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model☆27Nov 25, 2025Updated 4 months ago
- [CVPR 2025] Official implementation for JOOD "Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy"☆22Jun 11, 2025Updated 9 months ago
- [CVPR2025] T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation☆33Jul 10, 2025Updated 8 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…☆73Feb 9, 2026Updated last month
- [NeurIPS 2023] Official PyTorch implementation for the paper "CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganog…☆11Sep 28, 2023Updated 2 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- [ICLR 2024 Spotlight 🔥 ] - [ Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…☆80Jun 6, 2024Updated last year
- ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025☆13Aug 25, 2025Updated 7 months ago
- [IROS'25] COCMT☆12Aug 14, 2025Updated 7 months ago
- [ICLR 2025] PyTorch Implementation of "ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time"☆30Jul 20, 2025Updated 8 months ago
- ☆14Jul 2, 2024Updated last year
- ☆157Aug 9, 2022Updated 3 years ago
- ☆18Apr 7, 2025Updated 11 months ago
- ☆76Mar 30, 2025Updated 11 months ago
- Code for paper OmniSSR☆24Apr 21, 2025Updated 11 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation☆27Dec 12, 2025Updated 3 months ago
- [EMNLP'25] A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.☆50Aug 21, 2025Updated 7 months ago
- Official Implementation of Safe Latent Diffusion for Text2Image☆95Apr 21, 2023Updated 2 years ago
- This is the official repository for paper: cross-modal information flow in multimodal large language models☆42May 21, 2025Updated 10 months ago
- [NeurIPS 2024 Oral] "Bayesian-Guided Label Mapping for Visual Reprogramming"☆12Dec 20, 2024Updated last year
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities☆29Apr 2, 2025Updated 11 months ago
- ☆31Jul 18, 2024Updated last year
- Experiments for our CLEAR benchmark of unlearning methods in a multimodal setup☆20Aug 6, 2025Updated 7 months ago
- (NeurIPS 2025) Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"☆47Jun 3, 2025Updated 9 months ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- [TMLR'25] AutoTrust, a groundbreaking benchmark designed to assess the trustworthiness of DriveVLMs. This work aims to enhance public saf…☆53Nov 20, 2025Updated 4 months ago
- [ACL 2025] The official implementation of the paper "PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free".☆63Dec 4, 2025Updated 3 months ago
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?"☆18May 15, 2025Updated 10 months ago
- Consuming Resrouce via Auto-generation for LLM-DoS Attack under Black-box Settings☆19Sep 1, 2025Updated 6 months ago
- ☆19May 28, 2025Updated 9 months ago
- Official codes for FPR (Accepted by CVPR2025)☆14Mar 19, 2025Updated last year
- ☆10Oct 31, 2022Updated 3 years ago