eth-sri / ToolFuzz
ToolFuzz is a fuzzing framework designed to test your LLM Agent tools.
☆32 · Updated 4 months ago
Alternatives and similar repositories for ToolFuzz
Users who are interested in ToolFuzz are comparing it to the libraries listed below.
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆40 · Updated last month
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆56 · Updated 8 months ago
- TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a… ☆74 · Updated 2 months ago
- Let Claude control a web browser on your machine. ☆39 · Updated 5 months ago
- Visualize any repo or codebase into diagram or animation ☆20 · Updated last year
- Guardrails for secure and robust agent development ☆364 · Updated 3 months ago
- Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs ☆23 · Updated 4 months ago
- ☆62 · Updated this week
- SR-Scientist: Scientific Equation Discovery With Agentic AI ☆25 · Updated 2 weeks ago
- LLM-based mutation testing ☆11 · Updated 9 months ago
- A prompt defence is a multi-layer defence that can be used to protect your applications against prompt injection attacks. ☆19 · Updated last year
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆120 · Updated last month
- Test Generation for Prompts ☆143 · Updated this week
- ☆102 · Updated last year
- BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution ☆52 · Updated last month
- This repository serves as a comprehensive knowledge hub, curating cutting-edge research papers and developments across 25+ specialized do… ☆80 · Updated last week
- Data and evaluation scripts for "CodePlan: Repository-level Coding using LLMs and Planning", FSE 2024 ☆76 · Updated last year
- Official Repo for CRMArena and CRMArena-Pro ☆125 · Updated last week
- Codebase exploration with AI research agents ☆18 · Updated 8 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆98 · Updated 7 months ago
- Enhancing AI Software Engineering with Repository-level Code Graph ☆225 · Updated 7 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆24 · Updated last week
- A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX ☆42 · Updated this week
- ☆33 · Updated last month
- Easiest way to build custom agents, in a no-code notion style editor, using simple macros. ☆35 · Updated last year
- Official implementation of the WASP web agent security benchmark ☆53 · Updated 3 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆60 · Updated 6 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆172 · Updated last year
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state. ☆71 · Updated 5 months ago
- ☆84 · Updated last year