xiaoweih / AISafetyLectureNotesLinks
Machine Learning Safety
☆44, updated 3 weeks ago
Alternatives and similar repositories for AISafetyLectureNotes
Users interested in AISafetyLectureNotes are comparing it to the repositories listed below.
- Open source implementation of the TrojDRL algorithm presented in "TrojDRL: Evaluation of Backdoor Attacks on Deep Reinforcement Learning" (☆20, updated 5 years ago)
- Attack AlphaZero Go agents (NeurIPS 2022) (☆22, updated 3 years ago)
- CROWN: A Neural Network Verification Framework for Networks with General Activation Functions (☆39, updated 7 years ago)
- [S&P 2024] Replication package for "Mind Your Data! Hiding Backdoors in Offline Reinforcement Learning Datasets" (☆31, updated 11 months ago)
- ☆27, updated 2 years ago
- Official implementation of [USENIX Sec'25] "StruQ: Defending Against Prompt Injection with Structured Queries" (☆54, updated last month)
- [ICLR 2020] Code for the paper "Robustness Verification for Transformers" (☆27, updated last year)
- Code for reproducing the robustness evaluation scores in "Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approac…" (☆53, updated 7 years ago)
- A unified toolbox for running major robustness verification approaches for DNNs [S&P 2023] (☆90, updated 2 years ago)
- Official PyTorch implementation of "Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian O…" (☆25, updated 2 years ago)
- Adversarial attacks on Deep Reinforcement Learning (RL) (☆97, updated 4 years ago)
- PrivacyAsst: Safeguarding User Privacy in Tool-Using Large Language Model Agents (TDSC 2024) (☆17, updated last year)
- Repo for the arXiv preprint "Gradient-based Adversarial Attacks against Text Transformers" (☆109, updated 2 years ago)
- ☆20, updated last year
- Adversarial Examples: Attacks and Defenses for Deep Learning (☆32, updated 7 years ago)
- A simple implementation of Interval Bound Propagation (IBP) using TensorFlow: https://arxiv.org/abs/1810.12715 (☆161, updated 5 years ago)
- ☆21, updated 3 years ago
- ☆19, updated last year
- Machine Learning & Security Seminar @ Purdue University (☆25, updated 2 years ago)
- ☆39, updated last year
- [ICLR'24 Spotlight] DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer (☆46, updated last year)
- An LLM Can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024) (☆107, updated 10 months ago)
- Official PyTorch implementation of the paper "X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection" (☆16, updated 2 years ago)
- Is Neuron Coverage a Meaningful Measure for Testing Deep Neural Networks? (FSE 2020) (☆10, updated 4 years ago)
- ☆100, updated 5 years ago
- Code for the paper "Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures" (☆21, updated 3 years ago)
- All code for the paper "Piecewise Linear Neural Networks Verification: A Comparative Study" (☆35, updated 7 years ago)
- Efficient robustness verification for ReLU networks (this repository is outdated; do not use it — check out the new implementation at https://g…) (☆30, updated 6 years ago)
- alpha-beta-CROWN: An Efficient, Scalable and GPU-Accelerated Neural Network Verifier (winner of VNN-COMP 2021, 2022, 2023, 2024, 2025) (☆331, updated this week)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? (CVPR 2021) (☆14, updated 4 years ago)