zhao-ht / ConvexCertify
Code for our work CISS, "Certified Robustness Against Natural Language Attacks by Causal Intervention", published at ICML 2022
☆11Updated 3 years ago
Alternatives and similar repositories for ConvexCertify
Users who are interested in ConvexCertify are comparing it to the repositories listed below
- Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" in Findings of NAACL 2022☆31Updated 3 years ago
- Code for "Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution"☆31Updated 2 years ago
- ☆14Updated 5 years ago
- Implementation for Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder (EMNLP-Findings 2020)☆15Updated 5 years ago
- Code for the ICLR 2021 Paper "In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness"☆13Updated 4 years ago
- Group-conditional DRO to alleviate spurious correlations☆15Updated 4 years ago
- ☆25Updated 4 years ago
- ☆43Updated 2 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆10Updated last year
- ☆14Updated last year
- ACL 2021 - Defense against Adversarial Attacks in NLP via Dirichlet Neighborhood Ensemble☆18Updated 2 years ago
- ☆48Updated 10 months ago
- ☆37Updated 11 months ago
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives".☆29Updated last year
- Align your LM to express calibrated verbal statements of confidence in its long-form generations.☆28Updated last year
- Code for Environment Inference for Invariant Learning (ICML 2021 Paper)☆51Updated 4 years ago
- ☆33Updated 3 months ago
- Adversarial Training with Fast Gradient Projection Method against Synonym Substitution based Text Attacks☆24Updated 5 years ago
- For Certified Robustness to Text Adversarial Attacks by Randomized [MASK]☆17Updated last year
- ☆17Updated 4 years ago
- Official code repository for Correct-N-Contrast☆23Updated 3 years ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization☆31Updated 10 months ago
- [ICLR 2022] Boosting Randomized Smoothing with Variance Reduced Classifiers☆12Updated 3 years ago
- This is a PyTorch reimplementation of Influence Functions from the ICML2017 best paper: Understanding Black-box Predictions via Influence…☆17Updated 5 years ago
- Continual Memorization of Factoids in Large Language Models☆10Updated last year
- [ICLR 2020] Code for paper "Robustness Verification for Transformers"☆27Updated last year
- ☆25Updated 4 years ago
- SAFER: A Structure-free Approach For cErtified Robustness to Adversarial Word Substitutions (ACL 2020)☆31Updated 4 years ago
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c…☆21Updated 2 years ago
- Code repository for the paper "Invariant and Transportable Representations for Anti-Causal Domain Shifts"☆16Updated 3 years ago