egozverev/Should-It-Be-Executed-Or-Processed

Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.

☆61

Alternatives and similar repositories for Should-It-Be-Executed-Or-Processed

Users who are interested in Should-It-Be-Executed-Or-Processed are comparing it to the repositories listed below.

Zsbyqx20 / AgentHazard
Mobile GUI Agents under Real-world Threats: Are We There Yet?
☆17 · May 18, 2026 · Updated 2 months ago
multiplexerai / mplx_rag
Complex RAG backend
☆29 · Mar 28, 2024 · Updated 2 years ago
brendel-group / objects-compositional-generalization
Official code for the paper "Provable Compositional Generalization for Object-Centric Learning" (ICLR 2024, oral)
☆16 · Aug 26, 2024 · Updated last year
sunblaze-ucb / progent
Progent: Securing AI Agents with Privilege Control
☆41 · May 14, 2026 · Updated 2 months ago
cwhy / rwkv-decon
Trying to deconstruct RWKV in understandable terms
☆14 · May 6, 2023 · Updated 3 years ago
collinzrj / output2prompt
☆59 · Mar 12, 2025 · Updated last year
strangeloopcanon / ParaLLM
CLI that queries multiple language models in parallel using prompts from a CSV file
☆28 · Sep 24, 2025 · Updated 9 months ago
pasquini-dario / LLM_NeuralExec
Code to generate NeuralExecs (prompt injection for LLMs)
☆27 · Oct 5, 2025 · Updated 9 months ago
LLM-QC / judgezoo
A collection of judges for evaluating LLM output for safety and toxicity with a standardized API.
☆15 · Jan 7, 2026 · Updated 6 months ago
Greysahy / ipiguard
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆22 · Sep 16, 2025 · Updated 10 months ago
facebookresearch / prompt-siren
A research workbench for developing and testing attacks against large language models, with a focus on prompt injection vulnerabilities a…
☆54 · Updated this week
ethz-spylab / jailbreak-tax
☆24 · Feb 17, 2026 · Updated 5 months ago
datasette / datasette-scribe
☆10 · Jun 23, 2026 · Updated 3 weeks ago
JonasGeiping / carving
Package to optimize adversarial attacks against (large) language models with varied objectives
☆71 · Feb 22, 2024 · Updated 2 years ago
microsoft / TaskTracker
TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a…
☆92 · Sep 1, 2025 · Updated 10 months ago
Gananath / NERD
Evolution of Discrete data with Reinforcement Learning
☆13 · Dec 8, 2019 · Updated 6 years ago
shuaizhao95 / ICLAttack
ICL backdoor attack
☆17 · Nov 4, 2024 · Updated last year
safety-research / open-source-alignment-faking
Open Source Replication of Anthropic's Alignment Faking Paper
☆58 · Apr 4, 2025 · Updated last year
the-crypt-keeper / the-muse
Experimental sampler to make LLMs more creative
☆31 · Aug 2, 2023 · Updated 2 years ago
Bowen1911 / xJailbreak
Code for the paper "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"
☆17 · Apr 3, 2026 · Updated 3 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆60 · Oct 18, 2025 · Updated 9 months ago
dstackai / LLM-As-Chatbot
LLM as a Chatbot Service
☆17 · Aug 28, 2023 · Updated 2 years ago
facebookresearch / Meta_SecAlign
Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".
☆70 · Jun 11, 2026 · Updated last month
qcznlp / uncertainty_attack
☆23 · Sep 2, 2025 · Updated 10 months ago
utkuozbulak / imagenet-adversarial-image-evaluation
Code and some materials from the papers "Selection of Source Images Heavily Influences the Effectiveness of Adversarial Attacks" (BMVC 20…
☆12 · Nov 23, 2021 · Updated 4 years ago
GraySwanAI / nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
☆343 · May 13, 2025 · Updated last year
Brymir7 / PhaeroOS
AI-based "Happiness Optimizer"
☆12 · Oct 20, 2024 · Updated last year
ChenWu98 / agent-attack
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
☆139 · Feb 19, 2025 · Updated last year
hyperfocAIs / Attend
Attend - to what matters.
☆17 · Feb 22, 2025 · Updated last year
eatonphil / btree-rs
☆21 · Nov 29, 2023 · Updated 2 years ago
uw-nsl / CleanGen
[EMNLP 24] Official Implementation of CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
☆19 · Mar 9, 2025 · Updated last year
Edward-Sun / RECITE
Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI
☆96 · Feb 22, 2023 · Updated 3 years ago
Tlntin / novel-copilot
☆24 · Apr 25, 2023 · Updated 3 years ago
facebookresearch / SecAlign
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"
☆98 · Jul 2, 2026 · Updated 2 weeks ago
ChessScholar / Auto-PaLM
This is a project inspired by Auto-GPT, using Google's PaLM 2 API.
☆19 · Dec 11, 2023 · Updated 2 years ago
saferlhf-v / saferlhf-v
☆23 · Jun 16, 2025 · Updated last year
unica-isde / isde
Industrial Software Development (MSc Computer Engineering, Cybersecurity and AI, University of Cagliari, Italy)
☆24 · Dec 17, 2025 · Updated 7 months ago
GraySwanAI / ipi_arena_os
☆42 · Mar 18, 2026 · Updated 4 months ago
prasoongoyal / PixL2R
☆17 · Dec 21, 2020 · Updated 5 years ago