sshh12 / llm_backdoor

Experimental tools to backdoor large language models by rewriting their system prompts at the raw parameter level. This potentially enables offline remote code execution without running any actual code on the victim's machine, or lets an attacker thwart LLM-based fraud/moderation systems.

☆186

Alternatives and similar repositories for llm_backdoor

Users who are interested in llm_backdoor are comparing it to the libraries listed below.

StavC / Here-Comes-the-AI-Worm
Here Comes the AI Worm: Preventing the Propagation of Adversarial Self-Replicating Prompts Within GenAI Ecosystems
☆216 Updated last month
pasquini-dario / project_mantis
Project Mantis: Hacking Back the AI-Hacker; Prompt Injection as a Defense Against LLM-driven Cyberattacks
☆88 Updated 5 months ago
dreadnode / rigging
Lightweight LLM Interaction Framework
☆389 Updated this week
Reapor-Yurnero / imprompter
Codebase of https://arxiv.org/abs/2410.14923
☆51 Updated last year
BishopFox / BrokenHill
A productionized greedy coordinate gradient (GCG) attack tool for large language models (LLMs)
☆142 Updated 10 months ago
user1342 / Oversight
A Completely Modular LLM Reverse Engineering, Red Teaming, and Vulnerability Research Framework.
☆52 Updated 11 months ago
pdparchitect / llm-hacking-database
This repository contains various attacks against Large Language Models.
☆115 Updated last year
BishopFox / raink
Use LLMs for document ranking
☆151 Updated 6 months ago
google-research / camel-prompt-injection
Code for the paper "Defeating Prompt Injections by Design"
☆138 Updated 4 months ago
PalisadeResearch / llm-honeypot
☆45 Updated last week
dreadnode / robopages
A YAML based format for describing tools to LLMs, like man pages but for robots!
☆78 Updated 5 months ago
invariantlabs-ai / mcp-injection-experiments
Code snippets to reproduce MCP tool poisoning attacks.
☆183 Updated 6 months ago
wunderwuzzi23 / scratch
Repo with random useful scripts, utilities, prompts and stuff
☆175 Updated last week
PalisadeResearch / intercode
https://arxiv.org/abs/2412.02776
☆64 Updated 10 months ago
dropbox / llm-security
Dropbox LLM Security research code and results
☆237 Updated last year
wearetyomsmnv / Awesome-LLMSecOps
LLM | Security | Operations in one github repo with good links and pictures.
☆63 Updated 9 months ago
westonbrown / Cyber-AutoAgent
AI agent for autonomous cyber operations
☆319 Updated last week
dreadnode / tensor-man
A utility to inspect, validate, sign and verify machine learning model files.
☆59 Updated 8 months ago
osgil-defense / TARS
Using Agents To Automate Pentesting
☆304 Updated 9 months ago
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆235 Updated 8 months ago
ReversecLabs / spikee
☆91 Updated last week
faizann24 / baby-naptime
A very simple open source implementation of Google's Project Naptime
☆172 Updated 7 months ago
mbrg / genai-attacks
A knowledge source about TTPs used to target GenAI-based systems, copilots and agents
☆126 Updated 3 weeks ago
GH05TCREW / PentestAgent
All-in-one offensive security toolbox with AI agent and MCP architecture. Integrates tools like Nmap, Metasploit, FFUF, SQLMap. Enables p…
☆455 Updated 4 months ago
ibndias / CIPHER
Cybersecurity Intelligent Pentesting Helper for Ethical Researcher (CIPHER). Fine tuned LLM for penetration testing guidance based on wri…
☆32 Updated 10 months ago
mazen160 / llmquery
Powerful LLM Query Framework with YAML Prompt Templates. Made for Automation
☆33 Updated last month
arthurgervais / mapta
We present MAPTA, a multi-agent system for autonomous web application security assessment that combines large language model orchestratio…
☆69 Updated 2 months ago
peluche / deck-of-many-prompts
Manual Prompt Injection / Red Teaming Tool
☆42 Updated last year
dreadnode / dyana
A sandbox environment designed for loading, running and profiling a wide range of files, including machine learning models, ELFs, Pickle,…
☆331 Updated last week
haizelabs / get-haized
A subset of jailbreaks automatically discovered by the Haize Labs haizing suite.
☆97 Updated 6 months ago