[ACL 2025] The official implementation of the paper "PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free".
☆76 · Dec 4, 2025 · Updated 5 months ago
Alternatives and similar repositories for PIGuard
Users interested in PIGuard are comparing it to the libraries listed below.
- The code of the paper "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" accepted b… · ☆19 · Jan 16, 2024 · Updated 2 years ago
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen… · ☆48 · Apr 19, 2026 · Updated 2 weeks ago
- A simple PyTorch implementation of a baseline based on CLIP for image-text matching. · ☆19 · May 25, 2023 · Updated 2 years ago
- The official implementation of the paper "AgentDyn: A Dynamic Open-Ended Benchmark for Evaluating Prompt Injection Attacks of Real-World … · ☆48 · Apr 19, 2026 · Updated 2 weeks ago
- [NeurIPS 2023] The official implementation of the paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce… · ☆28 · May 14, 2024 · Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. · ☆119 · Apr 15, 2024 · Updated 2 years ago
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024. · ☆31 · Jul 2, 2025 · Updated 10 months ago
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023. · ☆15 · Dec 25, 2023 · Updated 2 years ago
- Code for our NAACL 2025 paper "Attention Tracker: Detecting Prompt Injection Attacks in LLMs" · ☆23 · Sep 19, 2025 · Updated 7 months ago
- [ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection". · ☆40 · Aug 4, 2025 · Updated 9 months ago
- Every practical and proposed defense against prompt injection. · ☆681 · Feb 22, 2025 · Updated last year
- [TIP 2022] Official code of the paper "Video Question Answering with Prior Knowledge and Object-sensitive Learning" · ☆46 · Jan 27, 2024 · Updated 2 years ago
- IEEE 1588 Precision Time Protocol simulation · ☆10 · May 6, 2019 · Updated 6 years ago
- A benchmark for prompt injection attacks and defenses in LLMs. · ☆434 · Oct 29, 2025 · Updated 6 months ago
- ☆18 · Mar 30, 2025 · Updated last year
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models" · ☆17 · Mar 31, 2025 · Updated last year
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" · ☆96 · Apr 14, 2026 · Updated 3 weeks ago
- [arXiv 2025] Denial-of-Service Poisoning Attacks on Large Language Models · ☆23 · Oct 22, 2024 · Updated last year
- ☆12 · Sep 29, 2024 · Updated last year
- [ICML 2025] UDora: A Unified Red Teaming Framework against LLM Agents · ☆33 · Jun 24, 2025 · Updated 10 months ago
- ☆25 · Jan 17, 2025 · Updated last year
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP · ☆11 · Jan 29, 2024 · Updated 2 years ago
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. · ☆72 · May 22, 2025 · Updated 11 months ago
- Detection of malicious prompts used to exploit large language models (LLMs) by leveraging supervised machine learning classifiers. · ☆20 · Oct 30, 2024 · Updated last year
- [CVPR 2025] Official repository for "IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment" · ☆28 · Jun 11, 2025 · Updated 10 months ago
- Towards safe LLMs with our simple yet highly effective Intention Analysis prompting. · ☆21 · Mar 25, 2024 · Updated 2 years ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs · ☆121 · Dec 2, 2024 · Updated last year
- A Python library for evaluating guardrail models. · ☆35 · Oct 9, 2025 · Updated 6 months ago
- Code and data accompanying the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks" · ☆36 · Apr 8, 2026 · Updated 3 weeks ago
- The official implementation of our preprint "Automatic and Universal Prompt Injection Attacks against Large Language Models". · ☆70 · Oct 23, 2024 · Updated last year
- The most comprehensive and accurate LLM jailbreak attack benchmark by far. · ☆21 · Mar 22, 2025 · Updated last year
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" · ☆13 · Jul 18, 2024 · Updated last year
- Implementation of the paper "Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing" · ☆24 · Jun 9, 2024 · Updated last year
- ☆69 · Sep 30, 2025 · Updated 7 months ago
- ☆33 · Sep 11, 2025 · Updated 7 months ago
- [AAAI 2025] Retrieval-Augmented Multimodal Social Media Popularity Prediction · ☆22 · Apr 27, 2026 · Updated last week
- Implementation of "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content" · ☆24 · Jul 28, 2024 · Updated last year
- [AAAI 2026] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie… · ☆10 · Feb 7, 2026 · Updated 2 months ago
- Convert bodies of text to IPA translations · ☆12 · May 2, 2023 · Updated 3 years ago