☆23Oct 25, 2024Updated last year
Alternatives and similar repositories for AgentAttack
Users that are interested in AgentAttack are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆182Oct 31, 2025Updated 5 months ago
- ☆31Feb 27, 2025Updated last year
- ☆37Oct 2, 2024Updated last year
- ☆13Nov 17, 2024Updated last year
- ☆129Jul 2, 2024Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models☆250Jan 27, 2026Updated 2 months ago
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents☆136Feb 19, 2025Updated last year
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIX- TURE OF THOUGHT REPRESENTATIONS FOR COST- EFFICIENT REA…☆31Jun 1, 2024Updated last year
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment☆27Jun 11, 2025Updated 10 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆97May 23, 2024Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- Official code for FAccT'21 paper "Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning" https://arxiv.org/abs…☆13Mar 9, 2021Updated 5 years ago
- ☆77Mar 13, 2026Updated 3 weeks ago
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM☆86Nov 3, 2024Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI.☆57Nov 13, 2023Updated 2 years ago
- ☆21May 23, 2025Updated 10 months ago
- Pytorch implementation of NPAttack☆12Jul 7, 2020Updated 5 years ago
- Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"☆62Aug 8, 2024Updated last year
- Agent Security Bench (ASB)☆214Oct 27, 2025Updated 5 months ago
- An Autonomous Curriculum Reinforcement Learning framework that steers agents to continually learn in specific environments with zero huma…☆26Feb 25, 2026Updated last month
- Kubernetes cli (kubectl) powered by GPT☆15Apr 20, 2023Updated 2 years ago
- True Few-Shot BioIE: Benchmarking GPT-3 In-Context and Small PLM Fine-Tuning☆12Jul 6, 2022Updated 3 years ago
- This repository contains data and code used for On the Risk of Misinformation Pollution with Large Language Models (EMNLP 2023 Findings).☆16Dec 14, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- ☆18Jan 3, 2025Updated last year
- ☆11Oct 18, 2022Updated 3 years ago
- Spectral Graph Attention Network with Fast Eigen-approximation☆11Dec 24, 2021Updated 4 years ago
- ☆73Feb 16, 2025Updated last year
- clock_plot provides a simple way to visualize timeseries data, mapping 24 hours onto the 360 degrees of a polar plot☆15Apr 5, 2022Updated 4 years ago
- [NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning☆11Oct 29, 2024Updated last year
- Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning☆22Jul 8, 2024Updated last year
- The MobSTr dataset provides artifacts that demonstrate Model-based Safety Assurance and Traceability for a safety-critical automotive sys…☆10Mar 18, 2022Updated 4 years ago
- Bridging the Generalization Gap in Text-to-SQL Parsing with Schema Expansion☆14Jul 26, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Code for Voice Jailbreak Attacks Against GPT-4o.☆38May 31, 2024Updated last year
- ☆12Apr 27, 2022Updated 3 years ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆198Mar 22, 2024Updated 2 years ago
- ☆15Apr 6, 2020Updated 6 years ago
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding☆152Jul 19, 2024Updated last year
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"☆211Apr 12, 2025Updated last year
- Author implementation of the paper "Decoupling Structure and Lexicon for Zero-Shot Semantic Parsing"☆18Nov 2, 2018Updated 7 years ago