Internal Consistency Regularization (CROW) for LLM Backdoor Elimination - Paper accepted to ICML 2025
☆16 · May 6, 2025 · Updated 11 months ago
Alternatives and similar repositories for CROW
Users who are interested in CROW are comparing it to the libraries listed below.
- Code associated with ICML (2024). "Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normaliz… ☆10 · Feb 22, 2026 · Updated 2 months ago
- Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering. ☆20 · Dec 6, 2024 · Updated last year
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning ☆10 · Nov 3, 2024 · Updated last year
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- ☆27 · May 27, 2020 · Updated 5 years ago
- ☆22 · Oct 25, 2024 · Updated last year
- ☆39 · May 21, 2025 · Updated 11 months ago
- Code and dataset for the paper: "Can Editing LLMs Inject Harm?" ☆21 · Dec 26, 2025 · Updated 4 months ago
- ACL 2023 paper "A Critical Evaluation of Evaluations for Long-form Question Answering" ☆21 · Mar 22, 2024 · Updated 2 years ago
- NeurIPS'24 - LLM Safety Landscape ☆39 · Oct 21, 2025 · Updated 6 months ago
- Test implementation of "Aligned Cross Entropy for Non-Autoregressive Machine Translation" https://arxiv.org/abs/2004.01655 ☆21 · Jul 25, 2024 · Updated last year
- Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021) ☆24 · Mar 18, 2021 · Updated 5 years ago
- [COLM'24] How Easily do Irrelevant Inputs Skew the Responses of Large Language Models? ☆22 · Oct 13, 2024 · Updated last year
- PyTorch 1.11 reimplementation of multi task gradient adaptation ideas: Gradient Surgery (PCGrad) and Gradient Vaccine ☆21 · Jun 20, 2022 · Updated 3 years ago
- Official code for FAccT'21 paper "Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning" https://arxiv.org/abs… ☆13 · Mar 9, 2021 · Updated 5 years ago
- With the rapid adoption of smartphones, tablets, and mobile apps, they are increasingly becoming part of children’s daily life for amusem… ☆12 · Apr 7, 2017 · Updated 9 years ago
- Implementation of "DeepWriter: A Multi-Stream Deep CNN for Text-independent Writer Identification" ☆16 · Feb 3, 2020 · Updated 6 years ago
- PyTorch implementation of NPAttack ☆12 · Jul 7, 2020 · Updated 5 years ago
- Implementation of our paper in EMNLP 2022, focused on the relationship between parent and child in transfer learning for low-resourc… ☆17 · Dec 7, 2022 · Updated 3 years ago
- [ICLR2025] Detecting Backdoor Samples in Contrastive Language Image Pretraining ☆19 · Feb 26, 2025 · Updated last year
- ☆20 · Feb 18, 2024 · Updated 2 years ago
- Code for paper "Membership Inference Attacks Against Vision-Language Models" ☆29 · Jan 25, 2025 · Updated last year
- Image Shortcut Squeezing: Countering Perturbative Availability Poisons with Compression ☆14 · Mar 22, 2025 · Updated last year
- [ICSE-SEIP'21] Robustness of on-device Models: Adversarial Attack to Deep Learning Models on Android Apps ☆15 · Jun 2, 2022 · Updated 3 years ago
- ☆20 · Oct 29, 2023 · Updated 2 years ago
- ☆24 · Dec 8, 2024 · Updated last year
- ☆11 · Oct 18, 2022 · Updated 3 years ago
- [NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning ☆11 · Oct 29, 2024 · Updated last year
- Implementation of HistoSketch and D2HistoSketch in MATLAB ☆19 · Aug 29, 2018 · Updated 7 years ago
- Implementation of TABOR: A Highly Accurate Approach to Inspecting and Restoring Trojan Backdoors in AI Systems (https://arxiv.org/pdf/190… ☆19 · Apr 13, 2023 · Updated 3 years ago
- Implementation for <Understanding Robust Overfitting of Adversarial Training and Beyond> in ICML'22. ☆13 · Jul 1, 2022 · Updated 3 years ago
- ☆12 · Apr 27, 2022 · Updated 4 years ago
- Code for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models" ☆63 · May 8, 2023 · Updated 2 years ago
- ☆18 · Jun 27, 2021 · Updated 4 years ago
- ☆24 · Mar 12, 2024 · Updated 2 years ago
- Code Repository for the Paper --- Revisiting the Assumption of Latent Separability for Backdoor Defenses (ICLR 2023) ☆47 · Feb 28, 2023 · Updated 3 years ago
- Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime ☆54 · Sep 19, 2023 · Updated 2 years ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ☆61 · Jan 15, 2025 · Updated last year
- ☆35 · Oct 22, 2025 · Updated 6 months ago