eth-sri / ToolFuzz

ToolFuzz is a fuzzing framework designed to test your LLM Agent tools.

☆32

Alternatives and similar repositories for ToolFuzz

Users who are interested in ToolFuzz are comparing it to the repositories listed below.

invariantlabs-ai / explorer
A better way of testing, inspecting, and analyzing AI Agent traces.
☆40 Updated last month
egozverev / Should-It-Be-Executed-Or-Processed
Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.
☆56 Updated 8 months ago
microsoft / TaskTracker
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆74 Updated 2 months ago
invariantlabs-ai / playwright-computer-use
Let Claude control a web browser on your machine.
☆39 Updated 5 months ago
fangyuan-ksgk / repo-viewer
Visualize any repo or codebase into diagram or animation
☆20 Updated last year
invariantlabs-ai / invariant
Guardrails for secure and robust agent development
☆364 Updated 3 months ago
OPTML-Group / Unlearn-Trace
Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs
☆23 Updated 4 months ago
microsoft / agdebugger
☆62 Updated this week
GAIR-NLP / SR-Scientist
SR-Scientist: Scientific Equation Discovery With Agentic AI
☆25 Updated 2 weeks ago
githubnext / llmorpheus
LLM-based mutation testing
☆11 Updated 9 months ago
Safetorun / PromptDefender
A prompt defence is a multi-layer defence that can be used to protect your applications against prompt injection attacks.
☆19 Updated last year
ibm-granite / granite-guardian
The Granite Guardian models are designed to detect risks in prompts and responses.
☆120 Updated last month
microsoft / promptpex
Test Generation for Prompts
☆143 Updated this week
DeepSoftwareAnalytics / Awesome-Agent4SE
☆102 Updated last year
bigcode-project / bigcodearena
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
☆52 Updated last month
mahmoudrabie / agentic-ai
This repository serves as a comprehensive knowledge hub, curating cutting-edge research papers and developments across 25+ specialized do…
☆80 Updated last week
microsoft / CodePlan
Data and evaluation scripts for "CodePlan: Repository-level Coding using LLMs and Planning", FSE 2024
☆76 Updated last year
SalesforceAIResearch / CRMArena
Official Repo for CRMArena and CRMArena-Pro
☆125 Updated last week
codegen-sh / deep-research
Codebase exploration with AI research agents
☆18 Updated 8 months ago
haizelabs / get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
☆98 Updated 7 months ago
ozyyshr / RepoGraph
Enhancing AI Software Engineering with Repository-level Code Graph
☆225 Updated 7 months ago
goncalorafaria / qalign
QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.
☆24 Updated last week
radiantlogicinc / fastworkflow
A framework for building large-scale, deterministic, interactive workflows with a fault-tolerant, conversational UX
☆42 Updated this week
zikuicai / aegisllm
☆33 Updated last month
NaturalAgents / NaturalAgents
Easiest way to build custom agents, in a no-code notion style editor, using simple macros.
☆35 Updated last year
facebookresearch / wasp
Official implementation of the WASP web agent security benchmark
☆53 Updated 3 months ago
Columbia-NLP-Lab / PAPILLON
Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles
☆60 Updated 6 months ago
ryoungj / ToolEmu
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
☆172 Updated last year
RobustNLP / DeRTa
A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state.
☆71 Updated 5 months ago
MinorJerry / OpenWebVoyager
☆84 Updated last year