A curated list of trustworthy Generative AI papers, updated daily.
☆75 · Updated Sep 4, 2024
Alternatives and similar repositories for Awesome-Foundation-Model-Security
Users interested in Awesome-Foundation-Model-Security are comparing it to the libraries listed below.
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" · ☆87 · Updated Nov 28, 2023
- ☆17 · Updated Jul 25, 2022
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c… · ☆21 · Updated May 30, 2023
- A list of papers in NeurIPS 2022 related to adversarial attack and defense / AI security. · ☆76 · Updated Dec 5, 2022
- [ICLR 2025] Detecting Backdoor Samples in Contrastive Language Image Pretraining · ☆19 · Updated Feb 26, 2025
- ☆16 · Updated Feb 23, 2025
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …" · ☆35 · Updated Oct 23, 2024
- ☆37 · Updated Oct 2, 2024
- Repository for the paper (AAAI 2024, Oral) "Visual Adversarial Examples Jailbreak Large Language Models" · ☆270 · Updated May 13, 2024
- A curated list of papers & resources linked to data poisoning, backdoor attacks and defenses against them (no longer maintained) · ☆288 · Updated Jan 11, 2025
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast · ☆117 · Updated Mar 26, 2024
- A re-implementation of the "Extracting Training Data from Large Language Models" paper by Carlini et al., 2020 · ☆39 · Updated Jul 10, 2022
- ☆12 · Updated Jan 14, 2026
- A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.) · ☆1,937 · Updated Apr 2, 2026
- Official implementation of the AAAI 2022 paper "Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement" · ☆17 · Updated Dec 23, 2021
- Demo code for the paper "One Thing to Fool Them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features" · ☆12 · Updated Nov 30, 2023
- ☆19 · Updated Mar 9, 2024
- ☆20 · Updated Dec 14, 2024
- Proof-of-concept implementation for the paper "ThermalScope: A Practical Interrupt Side Channel Attack Based On Thermal Event Interrupts"… · ☆13 · Updated Dec 17, 2024
- ☆10 · Updated Oct 31, 2022
- [CVPR 2023] T-SEA: Transfer-based Self-Ensemble Attack on Object Detection · ☆118 · Updated Oct 11, 2024
- PyTorch implementation for the pilot study on the robustness of latent diffusion models. · ☆12 · Updated Jun 20, 2023
- A curated list of machine learning security & privacy papers published in the top-4 security conferences (IEEE S&P, ACM CCS, USENIX Security… · ☆343 · Updated Nov 11, 2025
- Code for the paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" · ☆65 · Updated Apr 24, 2024
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn Attacker for LLM · ☆39 · Updated Jan 17, 2025
- The code of "NeurJudge: A Circumstance-aware Neural Framework for Legal Judgment Prediction" (SIGIR 2021) · ☆17 · Updated Jan 3, 2024
- Proof-of-concept implementation for the paper "SegScope: Probing Fine-grained Interrupts via Architectural Footprints" (HPCA'24) · ☆20 · Updated Apr 2, 2026
- A curation of awesome tools, documents and projects about LLM Security. · ☆1,565 · Updated Aug 20, 2025
- From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks · ☆15 · Updated Feb 23, 2023
- Official codebase for "Image Hijacks: Adversarial Images Can Control Generative Models at Runtime" · ☆54 · Updated Sep 19, 2023
- ☆373 · Updated Apr 8, 2026
- Code for the paper "Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators" · ☆16 · Updated Dec 4, 2024
- ☆13 · Updated Jul 8, 2020
- Code for the paper "IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Gene…" · ☆35 · Updated May 23, 2024
- A collection of resources on attacks and defenses targeting text-to-image diffusion models · ☆96 · Updated Dec 20, 2025
- Divide-and-Conquer Attack: Harnessing the Power of LLM to Bypass the Censorship of Text-to-Image Generation Model · ☆17 · Updated Feb 16, 2025
- A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide… · ☆1,827 · Updated Apr 3, 2026
- A curated list of papers on adversarial machine learning (adversarial examples and defense methods). · ☆211 · Updated May 27, 2022