joannahuadu / radiation-error-emulator
Simulator.
☆104 · Updated this week
Alternatives and similar repositories for radiation-error-emulator:
Users interested in radiation-error-emulator are comparing it to the libraries listed below.
- This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction. ☆56 · Updated 4 months ago
- [ACL 2024] The official GitHub repo for the paper "The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Pe… ☆76 · Updated 9 months ago
- ☆51 · Updated 3 months ago
- ☆25 · Updated 6 months ago
- Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks (IEEE S&P 2024) ☆33 · Updated last week
- Some code for "Stealing Part of a Production Language Model" ☆13 · Updated last year
- This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning… ☆17 · Updated last year
- Composite Backdoor Attacks Against Large Language Models ☆13 · Updated last year
- Codes for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models" ☆57 · Updated last year
- [WWW '25] Model Supply Chain Poisoning: Backdooring Pre-trained Models via Embedding Indistinguishability ☆16 · Updated 2 months ago
- A toolbox for backdoor attacks. ☆21 · Updated 2 years ago
- SaTML'23 paper "Backdoor Attacks on Time Series: A Generative Approach" by Yujing Jiang, Xingjun Ma, Sarah Monazam Erfani, and James Bail… ☆18 · Updated 2 years ago
- A survey on harmful fine-tuning attacks for large language models ☆161 · Updated last week
- ☆18 · Updated 10 months ago
- [ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models ☆33 · Updated 9 months ago
- [KDD 2024] Is Aggregation the Only Choice? Federated Learning via Layer-wise Model Recombination ☆25 · Updated 5 months ago
- ☆79 · Updated last year
- [NDSS 2025] Official code for our paper "Explanation as a Watermark: Towards Harmless and Multi-bit Model Ownership Verification via Wate… ☆33 · Updated 5 months ago
- 🔥🔥🔥 Breaking long thought processes of o1-like LLMs, such as DeepSeek-R1 and QwQ ☆28 · Updated last month
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models ☆18 · Updated last month
- SampDetox: Black-box Backdoor Defense via Perturbation-based Sample Detoxification ☆11 · Updated 2 months ago
- ☆20 · Updated last year
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images ☆33 · Updated last year
- Code for the paper "PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models" (IEEE ICASSP 2024). Demo: //124.220.228.133:11107 ☆17 · Updated 8 months ago
- This is an official repository for Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study (ICCV2023… ☆22 · Updated last year
- [ICLR 2024] Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks ☆13 · Updated last year
- This is the official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturba… ☆25 · Updated last month
- Official Code for ACL 2024 paper "GradSafe: Detecting Unsafe Prompts for LLMs via Safety-Critical Gradient Analysis" ☆56 · Updated 5 months ago
- Repository for Towards Codable Watermarking for Large Language Models ☆36 · Updated last year
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector" ☆36 · Updated 5 months ago