SALT-NLP / PopupAttack
Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
☆48Updated 11 months ago
Alternatives and similar repositories for PopupAttack
Users that are interested in PopupAttack are comparing it to the libraries listed below
- ☆30Updated last year
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast☆117Updated last year
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆62Updated last year
- ☆23Updated last year
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents☆120Updated 9 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆86Updated 6 months ago
- Codebase for Inference-Time Policy Adapters☆24Updated 2 years ago
- ☆22Updated last year
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)☆85Updated last year
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆98Updated last year
- ☆50Updated 10 months ago
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use☆174Updated last year
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…☆85Updated this week
- ☆20Updated 5 months ago
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks☆32Updated last year
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.☆18Updated 11 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆118Updated 7 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning☆143Updated 2 months ago
- ☆33Updated last year
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆93Updated 2 months ago
- ☆66Updated 6 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Updated last year
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)☆93Updated 7 months ago
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks☆13Updated 9 months ago
- The official repository of the paper "On the Exploitability of Instruction Tuning".☆66Updated last year
- ☆34Updated 7 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆147Updated last year
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆35Updated 9 months ago
- Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"☆48Updated 6 months ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29Updated 6 months ago