Official implementation of “Response Attack: Exploiting Contextual Priming to Jailbreak Large Language Models” (AAAI 2026).
☆31 · Dec 17, 2025 · Updated 2 months ago
Alternatives and similar repositories for Response-Attack
Users who are interested in Response-Attack compare it to the repositories listed below.
- Official implementation of Visco-Attack (EMNLP 2025 Main). We will progressively release the code and one-click reproduction scripts. ☆30 · Aug 22, 2025 · Updated 6 months ago
- DataBaseLab, XJTU: Xi'an Jiaotong University database lab ☆10 · Jun 25, 2024 · Updated last year
- Moneybags: a cashflow management system ☆10 · Jul 4, 2024 · Updated last year
- Official Implementation of "Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts" at EMNLP 202… ☆13 · Oct 27, 2024 · Updated last year
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1 ☆13 · Apr 23, 2025 · Updated 10 months ago
- AI-enabled document processing engine ☆16 · Feb 11, 2026 · Updated 3 weeks ago
- ☆29 · Feb 12, 2026 · Updated 2 weeks ago
- Instant Graph Neural Networks for Dynamic Graphs ☆11 · Dec 28, 2022 · Updated 3 years ago
- ☆11 · Jan 19, 2025 · Updated last year
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews ☆17 · Dec 14, 2025 · Updated 2 months ago
- ☆11 · Feb 2, 2024 · Updated 2 years ago
- ☆18 · Apr 7, 2025 · Updated 10 months ago
- [ICLR 2026] "When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms" ☆26 · Feb 3, 2026 · Updated last month
- [ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion ☆58 · Oct 1, 2025 · Updated 5 months ago
- Diagnostic Framework for LLMs and MLLMs ☆31 · Feb 6, 2026 · Updated 3 weeks ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆25 · Oct 7, 2025 · Updated 4 months ago
- Graph Coarsening with Neural Networks ☆11 · Mar 3, 2022 · Updated 4 years ago
- ☆12 · Sep 10, 2024 · Updated last year
- Welcome to the official repository for Siren, a project aimed at understanding and mitigating harmful behaviors in large language models … ☆15 · Sep 12, 2025 · Updated 5 months ago
- ☆24 · May 23, 2025 · Updated 9 months ago
- The code of Dynamic Graph Learning Based on Hierarchical Memory for Origin-Destination Demand Prediction ☆14 · Apr 29, 2022 · Updated 3 years ago
- A lightweight tool for detecting bugs on Graph Database Management Systems ☆15 · Jan 9, 2024 · Updated 2 years ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechan… ☆16 · Aug 6, 2024 · Updated last year
- ☆36 · Jun 14, 2025 · Updated 8 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆14 · Jun 21, 2024 · Updated last year
- Jailbreak Evo ☆21 · Jun 2, 2025 · Updated 9 months ago
- Official repository for "On the Multi-modal Vulnerability of Diffusion Models" ☆16 · Jul 15, 2024 · Updated last year
- ☆15 · Aug 1, 2023 · Updated 2 years ago
- [CVPR 2025] Official implementation for JOOD "Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy" ☆21 · Jun 11, 2025 · Updated 8 months ago
- ☆17 · Feb 22, 2024 · Updated 2 years ago
- [ICSE'25] Aligning the Objective of LLM-based Program Repair ☆23 · Mar 8, 2025 · Updated 11 months ago
- [NeurIPS'24] Protecting Your LLMs with Information Bottleneck ☆25 · Nov 7, 2024 · Updated last year
- ☆24 · Feb 17, 2026 · Updated 2 weeks ago
- ☆22 · Sep 2, 2025 · Updated 6 months ago
- [CVPR2025] T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation ☆32 · Jul 10, 2025 · Updated 7 months ago
- ☆26 · Mar 17, 2025 · Updated 11 months ago
- ☆32 · Feb 20, 2026 · Updated last week
- ☆20 · Feb 11, 2024 · Updated 2 years ago