IntelLabs / LLMart
LLM Adversarial Robustness Toolkit (LLMart): a toolkit for evaluating the robustness of large language models through adversarial testing.
☆33 · Updated last week
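By way of illustration (this is not LLMart's actual API; consult the repository for that), a minimal robustness check of the kind such toolkits automate might perturb a prompt and measure how often the model's output changes. The model name and the crude character-swap perturbation below are placeholder assumptions.

```python
# Hypothetical sketch of an adversarial robustness check; not LLMart's API.
import random
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; substitute the model under test
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def perturb(prompt: str, n_swaps: int = 3) -> str:
    # Crude character-swap perturbation standing in for a real attack.
    chars = list(prompt)
    for _ in range(n_swaps):
        i = random.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def respond(prompt: str) -> str:
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

prompt = "Translate to French: good morning"
baseline = respond(prompt)
changed = sum(respond(perturb(prompt)) != baseline for _ in range(10))
print(f"output changed under {changed}/10 perturbations")
```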
Alternatives and similar repositories for LLMart
Users interested in LLMart are comparing it to the libraries listed below:
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆94 · Updated this week
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆188 · Updated last week
- Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight] ☆97 · Updated 3 weeks ago
- ☆45 · Updated 10 months ago
- Differentially-private transformers using HuggingFace and Opacus ☆139 · Updated 10 months ago
- Code to break Llama Guard ☆31 · Updated last year
- [ICML 2024 Spotlight] Differentially Private Synthetic Data via Foundation Model APIs 2: Text ☆40 · Updated 5 months ago
- Supply chain security for ML ☆167 · Updated last week
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆39 · Updated last month
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆82 · Updated 6 months ago
- A fast + lightweight implementation of the GCG algorithm in PyTorch (see the sketch after this list) ☆246 · Updated last month
- ☆101 · Updated last year
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆51 · Updated 10 months ago
- ☆116 · Updated 2 weeks ago
- TAOISM: A TEE-based Confidential Heterogeneous Deployment Framework for DNN Models ☆39 · Updated last year
- AI Starter Kit for Predictive Asset Maintenance using the Intel®-optimized version of XGBoost ☆1 · Updated last year
- A survey of privacy problems in Large Language Models (LLMs). Contains a summary of the corresponding paper along with relevant code. ☆67 · Updated last year
- ☆58 · Updated 6 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training". ☆108 · Updated last year
- An unofficial implementation of the AutoDAN attack on LLMs (arXiv:2310.15140) ☆42 · Updated last year
- A Comprehensive Assessment of Trustworthiness in GPT Models ☆294 · Updated 9 months ago
- A benchmark for LLMs on complicated tasks in the terminal ☆208 · Updated this week
- Improving Alignment and Robustness with Circuit Breakers ☆214 · Updated 9 months ago
- Confidential Computing Zoo provides confidential computing solutions based on Intel SGX, TDX, HEXL, and related technologies. ☆328 · Updated this week
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆158 · Updated 2 months ago
- Privacy backdoors ☆51 · Updated last year
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆51 · Updated 2 months ago
- An Open Framework for Federated Learning. ☆788 · Updated this week
- Jailbreak artifacts for JailbreakBench ☆60 · Updated 7 months ago
- BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆167 · Updated this week
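As a companion to the GCG implementation listed above, here is a single-step sketch of the greedy coordinate gradient idea: take the gradient of the target loss with respect to a one-hot relaxation of the adversarial suffix, then greedily try the highest-scoring token substitutions. The model name ("gpt2") and the prompt/suffix/target strings are placeholder assumptions; real implementations add multi-step search, batched candidate evaluation, and token filtering.

```python
# Single-step sketch of greedy coordinate gradient (GCG); illustrative only.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in for the target model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
embed = model.get_input_embeddings()

prompt_ids = tok("Tell me a story", return_tensors="pt").input_ids[0]
suffix_ids = tok(" ! ! ! !", return_tensors="pt").input_ids[0]
target_ids = tok(" Once upon a time", return_tensors="pt").input_ids[0]

def target_loss(suffix: torch.Tensor) -> torch.Tensor:
    # Cross-entropy of the target tokens given prompt + adversarial suffix.
    ids = torch.cat([prompt_ids, suffix, target_ids]).unsqueeze(0)
    logits = model(ids).logits[0]
    start = len(prompt_ids) + len(suffix)
    return F.cross_entropy(logits[start - 1:-1], ids[0, start:])

# Gradient of the loss w.r.t. a one-hot relaxation of the suffix tokens.
one_hot = F.one_hot(suffix_ids, embed.num_embeddings).float()
one_hot.requires_grad_(True)
seq = torch.cat([embed(prompt_ids), one_hot @ embed.weight, embed(target_ids)])
logits = model(inputs_embeds=seq.unsqueeze(0)).logits[0]
start = len(prompt_ids) + len(suffix_ids)
full = torch.cat([prompt_ids, suffix_ids, target_ids])
F.cross_entropy(logits[start - 1:-1], full[start:]).backward()

# Most promising substitutions per position: largest negative gradient,
# i.e. the tokens predicted to decrease the loss the most.
candidates = (-one_hot.grad).topk(8, dim=1).indices
with torch.no_grad():
    best, best_loss = suffix_ids.clone(), target_loss(suffix_ids)
    for pos in range(len(suffix_ids)):
        for cand in candidates[pos]:
            trial = best.clone()
            trial[pos] = cand
            loss = target_loss(trial)
            if loss < best_loss:
                best, best_loss = trial, loss
print(tok.decode(best), float(best_loss))
```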