eth-sri / ToolFuzz
ToolFuzz is a fuzzing framework designed to test your LLM Agent tools.
☆28 Updated 2 months ago
Alternatives and similar repositories for ToolFuzz
Users who are interested in ToolFuzz are comparing it to the libraries listed below.
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 Updated this week
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆66 Updated 3 weeks ago
- Guardrails for secure and robust agent development ☆346 Updated 2 months ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆55 Updated 6 months ago
- Visualize any repo or codebase into diagram or animation ☆20 Updated 11 months ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆116 Updated last week
- ☆59 Updated 2 weeks ago
- ☆31 Updated 6 months ago
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆170 Updated 5 months ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆91 Updated 9 months ago
- ☆46 Updated last year
- Official implementation of the WASP web agent security benchmark ☆48 Updated last month
- ☆100 Updated last year
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. ☆80 Updated last year
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆275 Updated 3 weeks ago
- A prompt defence is a multi-layer defence that can be used to protect your applications against prompt injection attacks. ☆18 Updated 11 months ago
- Let Claude control a web browser on your machine. ☆36 Updated 3 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆70 Updated 2 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆165 Updated last year
- Enhancing AI Software Engineering with Repository-level Code Graph ☆215 Updated 5 months ago
- Repo2Run is an LLM-based agent that automates environment configuration by generating error-free Dockerfiles for Python repositories. ☆52 Updated last month
- ☆274 Updated 2 months ago
- AgentFence is an open-source platform for automatically testing AI agent security. It identifies vulnerabilities such as prompt injection… ☆25 Updated 6 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ☆346 Updated 8 months ago
- A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX ☆37 Updated this week
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆48 Updated 2 months ago
- ☆33 Updated 4 months ago
- Code for the paper "Coding Agents with Multimodal Browsing are Generalist Problem Solvers" ☆85 Updated 2 weeks ago
- This repository provides a benchmark for prompt injection attacks and defenses ☆288 Updated 2 months ago
- Source code for the paper "INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing" ☆26 Updated 10 months ago