ltroin / llm_attack_defense_arena
☆66 · Updated 5 months ago
Related projects:
- ☆12 · Updated 9 months ago
- [USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a… ☆36 · Updated 3 weeks ago
- Repository for Towards Codable Watermarking for Large Language Models ☆26 · Updated last year
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024) ☆29 · Updated 6 months ago
- Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models" ☆17 · Updated 10 months ago
- JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess … ☆29 · Updated 2 months ago
- Jailbreaking Large Vision-language Models via Typographic Visual Prompts ☆76 · Updated 4 months ago
- Code to generate NeuralExecs (prompt injection for LLMs) ☆14 · Updated last month
- Accepted by ECCV 2024 ☆59 · Updated 2 months ago
- A curated list of trustworthy Generative AI papers, updated daily ☆67 · Updated 2 weeks ago
- The automated prompt injection framework for LLM-integrated applications. ☆157 · Updated last week
- A toolbox for backdoor attacks. ☆19 · Updated last year
- ☆19 · Updated 7 months ago
- ☆12 · Updated 5 months ago
- ☆10 · Updated 2 months ago
- ☆30 · Updated last month
- Code for paper "SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations" (IEEE S&P 2024) ☆16 · Updated last month
- ☆15 · Updated last year
- TAP: An automated jailbreaking method for black-box LLMs ☆106 · Updated 6 months ago
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) ☆110 · Updated 4 months ago
- ☆63 · Updated 10 months ago
- Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting ☆10 · Updated 5 months ago
- Backdooring Neural Code Search ☆12 · Updated last year
- [ACL2024-Main] Data and Code for WaterBench: Towards Holistic Evaluation of LLM Watermarks ☆17 · Updated 10 months ago
- A lightweight library for large language model (LLM) jailbreaking defense. ☆26 · Updated last month
- ☆207 · Updated 3 months ago
- 😎 Up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources. ☆73 · Updated this week
- This is the implementation for the CVPR 2022 Oral paper "Better Trigger Inversion Optimization in Backdoor Scanning." ☆23 · Updated 2 years ago
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models ☆61 · Updated last week
- ☆13 · Updated 2 years ago