AI-secure/PolyGuard

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/AI-secure/PolyGuard)

AI-secure / PolyGuard

☆23

Alternatives and similar repositories for PolyGuard

Users that are interested in PolyGuard are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Open-Agent-Safety / OpenAgentSafety
View on GitHub
Evaluating Agent Safety in Realistic, High-Risk Simulations
☆31Jul 6, 2026Updated 2 weeks ago
AI-secure / UDora
View on GitHub
[ICML 2025] UDora: A Unified Red Teaming Framework against LLM Agents
☆37Jun 24, 2025Updated last year
xjzzzzzzzz / MCPSafety
View on GitHub
☆22Dec 18, 2025Updated 7 months ago
AI-secure / MMDT
View on GitHub
Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models
☆29Mar 15, 2025Updated last year
paul-rottger / msts-multimodal-safety
View on GitHub
Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models"
☆20Mar 31, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
purpcode-uiuc / purpcode
View on GitHub
🔮Reasoning for Safer Code Generation; 🥇Winner Solution of Amazon Nova AI Challenge 2025
☆40Aug 24, 2025Updated 10 months ago
shaoshuo-ss / LeaFBench
View on GitHub
Official code for our paper "SoK: Large Language Model Copyright Auditing via Fingerprinting"
☆18Dec 31, 2025Updated 6 months ago
Kordi-AI / Multi-User-LLM-Agent
View on GitHub
Official code for the paper: "Multi-User Large Language Model Agents"
☆27May 11, 2026Updated 2 months ago
agiresearch / ASB
View on GitHub
Agent Security Bench (ASB)
☆272Apr 16, 2026Updated 3 months ago
WhitolfChen / SCAR
View on GitHub
[NeurIPS 2025] Taught Well Learned Ill: Towards Distillation-conditional Backdoor Attack
☆15Nov 19, 2025Updated 8 months ago
IMYangJinheng / DeepCFD-for-Prediction-of-flow-field-in-Laval-nozzle
View on GitHub
This is a U-Net-based deep learning model, which we call DeepCFD. You can use this model to predict the temperature, velocity, and pressu…
☆15Sep 30, 2025Updated 9 months ago
AI-secure / DecodingTrust-Agent
View on GitHub
☆68Jun 18, 2026Updated last month
OSU-NLP-Group / RedTeamCUA
View on GitHub
[ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
☆57Feb 9, 2026Updated 5 months ago
xiaoyuxin1002 / UQ-PLM
View on GitHub
Uncertainty Quantification with Pre-trained Language Models: An Empirical Analysis
☆15Oct 11, 2022Updated 3 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
HKUST-KnowComp / PrivLM-Bench
View on GitHub
Code for ACL 2024 paper: PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models.
☆16Feb 5, 2025Updated last year
AI-secure / RedCode
View on GitHub
[NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents
☆85Apr 24, 2026Updated 2 months ago
HanjiangHu / NBF-LLM
View on GitHub
The official code for "Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks".
☆18Jun 24, 2026Updated 3 weeks ago
reds-lab / Meta-Sift
View on GitHub
The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on …
☆20Apr 27, 2023Updated 3 years ago
UKPLab / emnlp2024-code-prompting
View on GitHub
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024
☆27Nov 13, 2024Updated last year
BillChan226 / SafeWatch
View on GitHub
[ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanati…
☆45Feb 11, 2025Updated last year
jianshuod / SafeSearch
View on GitHub
[ICML 2026] Official implementations of ``SafeSearch: Automated Red-Teaming of LLM-Based Search Agents''
☆19Mar 25, 2026Updated 3 months ago
enguard-ai / awesome-ai-guardrails
View on GitHub
A curated list of materials on AI guardrails
☆60Jun 22, 2026Updated last month
bigglesworthnotacat / LLM-Steg
View on GitHub
[ICLR 2026 Oral] Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
☆20Mar 22, 2026Updated 3 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
spring-epfl / WebGraph
View on GitHub
☆15Jul 30, 2024Updated last year
ShenzheZhu / JailDAM
View on GitHub
[COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model
☆26Nov 25, 2025Updated 7 months ago
albert-y1n / PISmith
View on GitHub
PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses
☆21Updated this week
HowieHwong / Agentic-Guardian
View on GitHub
[ICLR'26] Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
☆48Oct 26, 2025Updated 8 months ago
sdbds / florence2-ft-advanced
View on GitHub
finetune your florence2 model easy
☆21Jul 27, 2024Updated last year
PurCL / ProSec
View on GitHub
Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"
☆18Feb 26, 2026Updated 4 months ago
Lancelot-Xie / MAG-SQL
View on GitHub
MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL
☆19Jul 10, 2025Updated last year
zzh-thu-22 / ExtendAttack
View on GitHub
[AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning".
☆25Mar 18, 2026Updated 4 months ago
MercuryDemo / NFT-Auction
View on GitHub
浙江大学区块链大作业-去中心化NFT拍卖平台(前端完善版）
☆10Dec 27, 2021Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
SaFo-Lab / DoxBench
View on GitHub
[ICLR 2026] The official code for "Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models"
☆30Feb 7, 2026Updated 5 months ago
Yunhao-Feng / BackdoorAgent
View on GitHub
BackdoorAgent is a stage-aware framework and benchmark that instruments LLM-agent workflows (planning, memory, tools) to systematically i…
☆42Mar 16, 2026Updated 4 months ago
MurrayTom / ToolSafe
View on GitHub
Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…
☆71Mar 25, 2026Updated 3 months ago
AI-secure / semantic-randomized-smoothing
View on GitHub
[CCS 2021] TSS: Transformation-specific smoothing for robustness certification
☆26Oct 3, 2023Updated 2 years ago
ucsb-mlsec / SeCodePLT
View on GitHub
☆15Sep 24, 2025Updated 9 months ago
HKUST-KnowComp / PrivaCI-Bench
View on GitHub
☆23Apr 23, 2025Updated last year
shaoshuo-ss / Awesome-LLM-Fingerprinting
View on GitHub
Paper list of LLM fingerprinting, based on our paper titled "SoK: Large Language Model Copyright Auditing via Fingerprinting".
☆29Aug 28, 2025Updated 10 months ago