facebookresearch/SecAlign

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/facebookresearch/SecAlign)

facebookresearch / SecAlign

Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"

☆98

Alternatives and similar repositories for SecAlign

Users that are interested in SecAlign are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Sizhe-Chen / StruQ
View on GitHub
official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries
☆76Nov 10, 2025Updated 8 months ago
facebookresearch / Meta_SecAlign
View on GitHub
Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".
☆70Jun 11, 2026Updated last month
Sizhe-Chen / DefensiveToken
View on GitHub
Repo for the paper "Defending Against Prompt Injection With a Few DefensiveTokens"
☆19Nov 10, 2025Updated 8 months ago
ethz-spylab / agentdojo
View on GitHub
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.
☆670Jun 2, 2026Updated last month
yizhu-joy / DataFilter
View on GitHub
☆15Nov 29, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
microsoft / BIPIA
View on GitHub
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
☆147Apr 15, 2024Updated 2 years ago
liu00222 / Open-Prompt-Injection
View on GitHub
This repository provides a benchmark for prompt injection attacks and defenses in LLMs
☆465Oct 29, 2025Updated 8 months ago
facebookresearch / rl-injector
View on GitHub
Official release of code for the paper RL is a hammer and LLMs are nails A simple RL approach to stronger prompt injection attacks
☆53May 6, 2026Updated 2 months ago
microsoft / TaskTracker
View on GitHub
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆92Sep 1, 2025Updated 10 months ago
lapisrocks / rpo
View on GitHub
Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"
☆62Aug 8, 2024Updated last year
fzwark / Secure_LLM_System
View on GitHub
☆16Mar 9, 2025Updated last year
facebookresearch / prompt-siren
View on GitHub
A research workbench for developing and testing attacks against large language models, with a focus on prompt injection vulnerabilities a…
☆54Updated this week
facebookresearch / wasp
View on GitHub
Official implementation of the WASP web agent security benchmark
☆98Apr 13, 2026Updated 3 months ago
pasquini-dario / LLM_NeuralExec
View on GitHub
Code to generate NeuralExecs (prompt injection for LLMs)
☆27Oct 5, 2025Updated 9 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
lancopku / agent-backdoor-attacks
View on GitHub
Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆115Sep 27, 2024Updated last year
facebookresearch / multimodal-fusion-jailbreaks
View on GitHub
Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https//arxiv.org/abs/2410.03489)
☆20Oct 22, 2024Updated last year
facebookresearch / advprompter
View on GitHub
Official implementation of AdvPrompter https//arxiv.org/abs/2404.16873
☆183May 6, 2024Updated 2 years ago
SaFo-Lab / AGrail4Agent
View on GitHub
[ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".
☆42Aug 4, 2025Updated 11 months ago
ethz-spylab / superhuman-ai-consistency
View on GitHub
☆30Jun 19, 2023Updated 3 years ago
SaFo-Lab / Awesome-T2I-safety-Papers
View on GitHub
List of T2I safety papers, updated daily, welcome to discuss using Discussions
☆68Aug 12, 2024Updated last year
uiuc-kang-lab / InjecAgent
View on GitHub
☆152Jul 2, 2024Updated 2 years ago
ChenWu98 / agent-attack
View on GitHub
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
☆139Feb 19, 2025Updated last year
OSU-NLP-Group / EIA_against_webagent
View on GitHub
☆40Oct 2, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
Yu-Fangxu / COLD-Attack
View on GitHub
[ICML 2024] COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
☆176Dec 18, 2024Updated last year
Greysahy / ipiguard
View on GitHub
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆22Sep 16, 2025Updated 10 months ago
sunblaze-ucb / progent
View on GitHub
Progent: Securing AI Agents with Privilege Control
☆41May 14, 2026Updated 2 months ago
SchwinnL / LLM_Embedding_Attack
View on GitHub
Code to conduct an embedding attack on LLMs
☆33Jan 10, 2025Updated last year
leolee99 / PIGuard
View on GitHub
[ACL 2025] The official implementation of the paper "PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free".
☆79Dec 4, 2025Updated 7 months ago
MarkGHX / BiScope
View on GitHub
Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
☆29Feb 17, 2026Updated 5 months ago
SaFo-Lab / ReasoningBomb
View on GitHub
[CCS 2026] The official implementation of our CCS 2026 paper "ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathological…
☆15Jun 24, 2026Updated 3 weeks ago
microsoft / fides
View on GitHub
Flow Integrity Deterministic Enforcement System. Mechanisms for securing AI agents with information-flow control.
☆110May 30, 2025Updated last year
PurCL / ASTRA
View on GitHub
🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeati…
☆75May 11, 2026Updated 2 months ago
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
RPC2 / AutoInject
View on GitHub
☆20Jun 12, 2026Updated last month
SaFo-Lab / DRIFT
View on GitHub
[NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…
☆58Updated this week
google-research / camel-prompt-injection
View on GitHub
Code for the paper "Defeating Prompt Injections by Design"
☆356Jun 20, 2025Updated last year
uw-nsl / SafeDecoding
View on GitHub
Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
☆154Jul 19, 2024Updated 2 years ago
hwanchang00 / ChatInject
View on GitHub
[ICLR 2026] Official implementation of "ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents"
☆17Mar 23, 2026Updated 3 months ago
PurCL / ProSec
View on GitHub
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"
☆18Feb 26, 2026Updated 4 months ago
usnistgov / agentdojo-inspect
View on GitHub
A fork of AgentDojo compatible with Inspect.
☆16Oct 23, 2025Updated 8 months ago