UNHSAILLab/working-memory-attack-on-llms

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/UNHSAILLab/working-memory-attack-on-llms)

UNHSAILLab / working-memory-attack-on-llms

Working Memory Attack on LLMs

☆18

Alternatives and similar repositories for working-memory-attack-on-llms

Users that are interested in working-memory-attack-on-llms are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

UNHSAILLab / TaCo
View on GitHub
TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
☆14Jul 1, 2025Updated last year
behzadanksu / rl-attack
View on GitHub
Adversarial Example Attacks on Policy Learners
☆40Jul 23, 2020Updated 6 years ago
wegodev2 / virtual-prompt-injection
View on GitHub
Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection"
☆27Jul 6, 2024Updated 2 years ago
behzadanksu / cybertweets
View on GitHub
annotated dataset of cyber-security related tweets
☆22May 10, 2021Updated 5 years ago
au-revoir / model-editing-ft
View on GitHub
☆13Sep 8, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
zhaoyiran924 / Probe-Sampling
View on GitHub
[NeurIPS 2024] Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
☆35Nov 8, 2024Updated last year
weizeming / momentum-attack-llm
View on GitHub
☆25Jan 17, 2025Updated last year
datasec-lab / CodeBreaker
View on GitHub
[USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai…
☆60Mar 22, 2025Updated last year
meng-wenlong / LMSanitator
View on GitHub
☆29Aug 21, 2023Updated 2 years ago
jwergieluk / revllm
View on GitHub
RevLLM -- Reverse Engineering Tools for Large Language Models
☆22Feb 29, 2024Updated 2 years ago
Confirm-Solutions / flrt
View on GitHub
Fluent student-teacher redteaming
☆23Jul 25, 2024Updated 2 years ago
zhangrui4041 / Instruction_Backdoor_Attack
View on GitHub
☆25Aug 21, 2024Updated last year
Teddy-Li / LLM-NLI-Analysis
View on GitHub
☆15Jul 8, 2023Updated 3 years ago
SCLBD / DBD
View on GitHub
☆32Mar 4, 2022Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
ZiangYan / pda.pytorch
View on GitHub
Implementation of our ICLR 2021 paper: Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples.
☆11Mar 9, 2021Updated 5 years ago
mdrafiqulrabin / SIVAND
View on GitHub
ESEC/FSE'21: Prediction-Preserving Program Simplification
☆10Oct 4, 2022Updated 3 years ago
bboylyg / RNP
View on GitHub
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023)
☆40Dec 24, 2023Updated 2 years ago
epignatelli / reinforcement-learning-an-introduction
View on GitHub
A python implementation of the concepts in the book "Reinforcement Learning: An Introduction" by R.S. Sutton and A. G. Barto.
☆21Jul 13, 2020Updated 6 years ago
UNHSAILLab / SentimentalLIAR
View on GitHub
Our Sentimental LIAR dataset is a modified and further extended version of the LIAR extension introduced by Kirilin et al. In our dataset…
☆16Mar 31, 2022Updated 4 years ago
Bowen1911 / xJailbreak
View on GitHub
Code of paper: xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"
☆17Apr 3, 2026Updated 3 months ago
Beichen1996 / SRAAL
View on GitHub
State-Relabeling Adversarial Active Learning
☆14Aug 17, 2021Updated 4 years ago
hyggs / Anomaly-Detection-and-Attack-Identification-in-Network-Traffic-Based-on-Graph
View on GitHub
A project from EECS6414M of Winter 2020 at York University
☆11Mar 26, 2020Updated 6 years ago
KatherLab / prompt_injection_attacks
View on GitHub
☆14Dec 28, 2024Updated last year
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
Django-Jiang / BadChain
View on GitHub
[ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
☆56Jul 24, 2024Updated 2 years ago
xingyizhao / PURE
View on GitHub
Code associated with ICML (2024). "Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normaliz…
☆11Feb 22, 2026Updated 5 months ago
Reapor-Yurnero / imprompter
View on GitHub
Codebase of https://arxiv.org/abs/2410.14923
☆54Oct 22, 2024Updated last year
EIDOSLAB / unbiased-contrastive-learning
View on GitHub
Code for the paper "Unbiased Supervised Contrastive Learning" | ICLR 2023 https://openreview.net/forum?id=Ph5cJSfD2XN
☆12Sep 22, 2023Updated 2 years ago
thunlp / OpenBackdoor
View on GitHub
An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)
☆209Apr 10, 2023Updated 3 years ago
TDteach / Demon-in-the-Variant
View on GitHub
☆13Oct 21, 2021Updated 4 years ago
jiwoongim / DVAE-Pytorch-
View on GitHub
Denoising Variational Autoencoder
☆20Apr 26, 2018Updated 8 years ago
kangjie-chen / BadPre
View on GitHub
☆12Feb 21, 2022Updated 4 years ago
ngoc-nguyen-0 / LOKT_neurips2023
View on GitHub
☆13Apr 13, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
NTDXYG / COTTON
View on GitHub
Data and code for "Chain-of-Thought in Neural Code Generation: From and For Lightweight Language Models", which accepted in TSE.
☆15Jul 3, 2024Updated 2 years ago
EleutherAI / deep-ignorance
View on GitHub
☆20Jan 7, 2026Updated 6 months ago
wssun / KillBadCode
View on GitHub
Show Me Your Code! Kill Code Poisoning: A Lightweight Method Based on Code Naturalness
☆19Jul 17, 2025Updated last year
shuaizhao95 / ICLAttack
View on GitHub
ICL backdoor attack
☆17Nov 4, 2024Updated last year
dessertlab / Targeted-Data-Poisoning-Attacks
View on GitHub
This repository contains the code, the dataset and the experimental results related to the paper "Vulnerabilities in AI Code Generators: …
☆14Aug 5, 2024Updated last year
dut-laowang / PCLMM
View on GitHub
The code implementation for the article "Towards Patronizing and Condescending Language in Chinese Videos: A Multimodal Dataset and Fram…
☆16Apr 3, 2025Updated last year
deqangss / malware-uncertainty
View on GitHub
deep learning, malware detection, predictive uncertainty, dataset shift, calibration, uncertainty quantification, android malware
☆17Nov 30, 2021Updated 4 years ago