zichuan-liu / IB4LLMs
[NeurIPS'24] Protecting Your LLMs with Information Bottleneck
☆22 Updated last year
Alternatives and similar repositories for IB4LLMs
Users who are interested in IB4LLMs are comparing it to the repositories listed below.
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆71 Updated 6 months ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source… ☆68 Updated 10 months ago
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents ☆115 Updated 9 months ago
- ☆50 Updated 9 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" ☆21 Updated 3 months ago
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆117 Updated last year
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ☆47 Updated 11 months ago
- ☆18 Updated last month
- Code and data for the paper "A Semantic Invariant Robust Watermark for Large Language Models", accepted by ICLR 2024. ☆35 Updated last year
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks ☆32 Updated last year
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability. ☆18 Updated 10 months ago
- ☆75 Updated 3 months ago
- ☆153 Updated last year
- ☆35 Updated last year
- ☆30 Updated last year
- ☆22 Updated last year