☆23 · Oct 25, 2024 · Updated last year
Alternatives and similar repositories for AgentAttack
Users that are interested in AgentAttack are comparing it to the libraries listed below
- Code and dataset for the paper: "Can Editing LLMs Inject Harm?" · ☆21 · Dec 26, 2025 · Updated 2 months ago
- Repository for the Paper: Refusing Safe Prompts for Multi-modal Large Language Models · ☆18 · Oct 16, 2024 · Updated last year
- ☆117 · Jul 2, 2024 · Updated last year
- Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing" · ☆32 · Mar 31, 2025 · Updated 11 months ago
- Code for Voice Jailbreak Attacks Against GPT-4o. · ☆36 · May 31, 2024 · Updated last year
- Machine Learning & Security Seminar @Purdue University · ☆25 · May 9, 2023 · Updated 2 years ago
- Make reasoning models scalable · ☆47 · May 31, 2025 · Updated 9 months ago
- ☆11 · Dec 23, 2024 · Updated last year
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM · ☆83 · Nov 3, 2024 · Updated last year
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models. · ☆85 · Jan 19, 2025 · Updated last year
- utilities · ☆15 · Jul 2, 2013 · Updated 12 years ago
- ☆12 · May 6, 2022 · Updated 3 years ago
- A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks. · ☆106 · Apr 15, 2024 · Updated last year
- This repo is for the safety topic, including attacks, defenses and studies related to reasoning and RL · ☆61 · Sep 5, 2025 · Updated 5 months ago
- [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns · ☆13 · Mar 1, 2025 · Updated last year
- The Melbourne Open Data Playground GitHub community page is a part of the Melbourne Open Data Playground (MOP), an industry capstone proj… · ☆17 · Sep 10, 2025 · Updated 5 months ago
- BrainWash: A Poisoning Attack to Forget in Continual Learning · ☆12 · Apr 15, 2024 · Updated last year
- Code for AAAI21 paper "Scalable and Explainable 1-Bit Matrix Completion via Graph Signal Learning" · ☆11 · Feb 15, 2022 · Updated 4 years ago
- On the Robustness of GUI Grounding Models Against Image Attacks · ☆12 · Apr 8, 2025 · Updated 10 months ago
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024] · ☆109 · Sep 27, 2024 · Updated last year
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning · ☆98 · May 23, 2024 · Updated last year
- PyTorch code for the Neurips 2021 paper: Fairness via Representation Neutralization · ☆10 · Oct 26, 2021 · Updated 4 years ago
- A tool for extracting, modifying, and crafting ASDM binary packages (CVE-2022-20829) · ☆13 · Aug 15, 2022 · Updated 3 years ago
- ☆14 · Feb 26, 2025 · Updated last year
- This is the official repository of the following paper: "Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis"… · ☆10 · Jan 4, 2025 · Updated last year
- A basic Google Docs document viewer. · ☆11 · Aug 22, 2019 · Updated 6 years ago
- A friendly UI for arXiv hosting papers on fairness and ethics in Machine Learning & Data Science · ☆12 · Jul 4, 2019 · Updated 6 years ago
- Code for AISTATS'25 paper - On the Power of Adaptive Weighted Aggregation in Heterogeneous Federated Learning and Beyond · ☆13 · Sep 23, 2025 · Updated 5 months ago
- 8 labs of course Introduction to Computer System · ☆10 · Jun 17, 2014 · Updated 11 years ago
- Neural Turing Machine · ☆13 · Jun 18, 2018 · Updated 7 years ago
- Code for running forward and backward versions of GPT2 · ☆10 · Nov 20, 2021 · Updated 4 years ago
- Official Code Implementation for the CCS 2022 Paper "On the Privacy Risks of Cell-Based NAS Architectures" · ☆11 · Nov 21, 2022 · Updated 3 years ago
- Can Large Language Models Identify Authorship? (EMNLP 2024 Findings) · ☆12 · Feb 4, 2025 · Updated last year
- [ICLR 2022] Boosting Randomized Smoothing with Variance Reduced Classifiers · ☆12 · Mar 29, 2022 · Updated 3 years ago
- An automated multi-step research system for executing deep, comprehensive research with iterative refinement, source evaluation, and resu… · ☆14 · Mar 11, 2025 · Updated 11 months ago
- Concurrent hash tries for C++ 14 with no memory management whatsoever. · ☆10 · Aug 30, 2016 · Updated 9 years ago
- Spectral Graph Attention Network with Fast Eigen-approximation · ☆12 · Dec 24, 2021 · Updated 4 years ago
- FFT for PyCuda and PyOpenCL. The package is deprecated and its functionality is merged into Reikna. · ☆37 · Feb 17, 2014 · Updated 12 years ago
- The technical paper for the Spark protocol · ☆11 · Mar 5, 2024 · Updated last year