Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".
☆67Jun 3, 2026Updated this week
Alternatives and similar repositories for Meta_SecAlign
Users that are interested in Meta_SecAlign are comparing it to the libraries listed below.
- Official implementation of the WASP web agent security benchmark☆88Apr 13, 2026Updated last month
- Official implementation of [USENIX Sec'25] StruQ: Defending Against Prompt Injection with Structured Queries☆75Nov 10, 2025Updated 6 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"☆96May 6, 2026Updated last month
- Implementation for paper Automata Extraction from Transformers.☆12Jun 8, 2024Updated 2 years ago
- Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.☆20Dec 6, 2024Updated last year
- ☆56Mar 18, 2026Updated 2 months ago
- Website & Documentation: https://sbaresearch.github.io/model-watermarking/☆25Sep 22, 2023Updated 2 years ago
- ☆50Jul 19, 2025Updated 10 months ago
- Official implementation of AdvPrompter https://arxiv.org/abs/2404.16873☆181May 6, 2024Updated 2 years ago
- ☆12Feb 19, 2024Updated 2 years ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.☆598Jun 2, 2026Updated last week
- Adversarial Examples Detection Benchmark☆16Dec 6, 2024Updated last year
- [NeurIPS'24] Protecting Your LLMs with Information Bottleneck☆27Nov 7, 2024Updated last year
- Official codes of KDD'24 paper "HiFGL: A Hierarchical Framework for Cross-silo Cross-device Federated Graph Learning"☆10Sep 4, 2024Updated last year
- Awesome jailbreak and red-teaming arXiv papers (automatically updated every 12 hours)☆110Jun 1, 2026Updated last week
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.☆89Jan 19, 2025Updated last year
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…☆63Mar 25, 2026Updated 2 months ago
- WAFFLE: Watermarking in Federated Learning☆23Aug 21, 2023Updated 2 years ago
- ☆52May 24, 2023Updated 3 years ago
- ☆33Jan 26, 2025Updated last year
- ☆40Oct 2, 2024Updated last year
- ☆20Feb 3, 2025Updated last year
- Implementation of IEEE TNNLS 2023 and Elsevier PR 2023 papers on backdoor watermarking for deep classification models with unambiguity an…☆19Jul 27, 2023Updated 2 years ago
- Documenting large text datasets 🖼️ 📚☆14Dec 17, 2024Updated last year
- ☆24Apr 14, 2019Updated 7 years ago
- 🧨 TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful?☆79Nov 27, 2025Updated 6 months ago
- [ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments☆55Feb 9, 2026Updated 4 months ago
- ☆15Feb 11, 2025Updated last year
- web application, powered by Python Flask and OpenAI GPT-3, designed to generate exceptional AI-generated content for a wide range of appl…☆14Feb 7, 2023Updated 3 years ago
- Single-user Matrix.org Application Service (AS) to bridge SMSes to the Matrix network!☆12Jul 10, 2018Updated 7 years ago
- ☆11Dec 8, 2022Updated 3 years ago
- The official implementation of CVPR 2021 paper "Simulating Unknown Target Models for Query-Efficient Black-box Attacks"☆59Jun 18, 2021Updated 4 years ago
- Metaskill: A Meta-Skill for Autonomous AI Agent Team Generation☆50Feb 23, 2026Updated 3 months ago
- ☆27May 20, 2026Updated 2 weeks ago
- Study notes and assignment solutions for CS231n☆24May 11, 2019Updated 7 years ago
- Robustness for Non-Parametric Classification: A Generic Attack and Defense☆18Nov 21, 2022Updated 3 years ago
- ☆141Jul 2, 2024Updated last year
- Repo for arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers"☆111Dec 28, 2022Updated 3 years ago
- ☆10Jun 1, 2022Updated 4 years ago