THU-BPM / MarkLLM
MarkLLM: An Open-Source Toolkit for LLM Watermarking. (EMNLP 2024 System Demonstration)
☆659 · Updated last month
Alternatives and similar repositories for MarkLLM
Users interested in MarkLLM are comparing it to the libraries listed below.
- UP-TO-DATE LLM Watermark paper. 🔥🔥🔥 ☆363 · Updated 11 months ago
- AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list. ☆212 · Updated 2 months ago
- ☆40 · Updated last year
- [ICLR 2025 Spotlight] The official implementation of our ICLR2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to… ☆321 · Updated last month
- ☆153 · Updated last year
- An easy-to-use Python framework to generate adversarial jailbreak prompts. ☆756 · Updated 7 months ago
- Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025) ☆133 · Updated 7 months ago
- Up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources. ☆429 · Updated last week
- Repository for "Towards Codable Watermarking for Large Language Models" ☆38 · Updated 2 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector" ☆46 · Updated last month
- ☆49 · Updated 8 months ago
- ☆25 · Updated 8 months ago
- A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models ☆261 · Updated 2 weeks ago
- ☆643 · Updated 2 months ago
- [USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models ☆218 · Updated this week
- Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024. ☆34 · Updated last year
- [ICLR Workshop 2025] An official source code for paper "GuardReasoner: Towards Reasoning-based LLM Safeguards". ☆160 · Updated 6 months ago
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions, all in one framework. ☆289 · Updated 2 months ago
- ☆21 · Updated last year
- The official implementation of our NAACL 2024 paper "A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang… ☆145 · Updated 2 months ago
- [CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models ☆137 · Updated 4 months ago
- Source code of paper "An Unforgeable Publicly Verifiable Watermark for Large Language Models" accepted by ICLR 2024 ☆34 · Updated last year
- Accepted by IJCAI-24 Survey Track ☆223 · Updated last year
- ☆111 · Updated 9 months ago
- A curated list of resources dedicated to the safety of Large Vision-Language Models. This repository aligns with our survey titled A Surv… ☆165 · Updated last month
- multi-bit language model watermarking (NAACL 24) ☆17 · Updated last year
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆166 · Updated 8 months ago
- ☆136 · Updated 8 months ago
- A survey on harmful fine-tuning attack for large language model ☆220 · Updated this week
- Safety at Scale: A Comprehensive Survey of Large Model Safety ☆204 · Updated 9 months ago