alipay / YiJian-Community
YiJian-Community: a full-process automated large model safety evaluation tool designed for academic research
Related projects
Alternatives and complementary repositories for YiJian-Community
- Accepted by IJCAI-24 Survey Track
- 😎 An up-to-date, curated list of papers, methods, and resources on attacks against Large Vision-Language Models.
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM
- Accepted by ECCV 2024
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]
- S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
- A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
- Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).
- A package that achieves 95%+ transfer attack success rate against GPT-4
- JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and further assess …
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …"
- A curated list of papers on AIGC detection.
- An attack that induces hallucinations in LLMs.
- [ICLR'24 Spotlight] The official codes of our work on AIGC detection: "Multiscale Positive-Unlabeled Detection of AI-Generated Texts"
- A collection of resources on attacks and defenses targeting text-to-image diffusion models
- [arXiv:2311.03191] "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"
- [NAACL 2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…"
- Towards safe LLMs with our simple yet highly effective Intention Analysis Prompting
- Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]
- Jailbreaking Large Vision-language Models via Typographic Visual Prompts