AAAAAAsuka / llm_defends
Code for the paper "Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM"
☆14Nov 17, 2023Updated 2 years ago
Alternatives and similar repositories for llm_defends
Users that are interested in llm_defends are comparing it to the libraries listed below
- Code for MBGE-recognition: Emotion recognition based on multi-view body gestures, accepted at ICIP 2019.☆12Apr 6, 2023Updated 2 years ago
- Code for R-former: Legal Judgment Prediction via Relational Learning, accepted at SIGIR 2021.☆23Feb 21, 2022Updated 3 years ago
- Code for KERM: Incorporating Explicit Knowledge in Pre-trained Language Models for Passage Re-ranking, accepted at SIGIR 2022.☆19Oct 31, 2022Updated 3 years ago
- [AAAI2022] Code Release of Attacking Video Recognition Models with Bullet-Screen Comments☆25Mar 30, 2024Updated last year
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM☆39Jan 17, 2025Updated last year
- code of paper "IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Gene…☆34May 23, 2024Updated last year
- ☆11Apr 6, 2019Updated 6 years ago
- [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns☆13Mar 1, 2025Updated 11 months ago
- Official repository for "S^2IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting"☆53Jul 17, 2024Updated last year
- The official codes for our paper at COLING 2022: Semantic-Preserving Adversarial Code Comprehension☆12Oct 23, 2022Updated 3 years ago
- A supervised fine-tuning method for controllable reasoning length in large language models☆10May 8, 2025Updated 9 months ago
- Github Repo for ICML 2022 paper: Communication-Efficient Adaptive Federated Learning☆10Nov 18, 2022Updated 3 years ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …☆11Jun 18, 2024Updated last year
- ☆10Jul 24, 2023Updated 2 years ago
- A-Soul-Data JSON data storage☆13Sep 17, 2022Updated 3 years ago
- ☆10Oct 28, 2020Updated 5 years ago
- Official Repo of Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents☆58Oct 28, 2025Updated 3 months ago
- ☆21Jul 8, 2025Updated 7 months ago
- ☆13May 15, 2025Updated 9 months ago
- ☆11Jul 19, 2022Updated 3 years ago
- An end-to-end framework for pulmonary airway analysis☆18Jan 19, 2026Updated 3 weeks ago
- THUIR website☆10Feb 4, 2026Updated last week
- ICML2025: One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework☆14Jun 24, 2025Updated 7 months ago
- Generating Human Skeletons with Mutual Actions☆11Oct 22, 2021Updated 4 years ago
- The official implementation for Common Sense Enhanced Knowledge-based Recommendation with Large Language Model☆13Apr 21, 2024Updated last year
- TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice☆22Nov 24, 2025Updated 2 months ago
- ☆16Nov 18, 2024Updated last year
- [COLING 2025🔥] Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection☆16Jan 21, 2025Updated last year
- Official repository for WWW'24 paper "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation"☆12Jul 25, 2024Updated last year
- Sparse-RS: a versatile framework for query-efficient sparse black-box adversarial attacks☆46Feb 24, 2022Updated 3 years ago
- HSML Dynamic version for ICML 2019☆12Jul 11, 2019Updated 6 years ago
- ☆12Aug 16, 2018Updated 7 years ago
- ☆16Feb 17, 2025Updated 11 months ago
- ☆22Sep 5, 2025Updated 5 months ago
- Benchmarking Large Language Models' Resistance to Malicious Code☆14Dec 1, 2024Updated last year
- [EMNLP'22] Textual Manifold-based Defense Against Natural Language Adversarial Examples☆11Apr 6, 2023Updated 2 years ago
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"☆15Mar 25, 2025Updated 10 months ago
- Prototypical Contrast and Reverse Prediction: Unsupervised Skeleton based Action Recognition☆11Aug 30, 2021Updated 4 years ago
- ☆17Jan 5, 2026Updated last month