SALT-NLP / PopupAttack
Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
☆33 · Updated 6 months ago
Alternatives and similar repositories for PopupAttack
Users interested in PopupAttack are comparing it to the repositories listed below
- ☆19 · Updated 8 months ago
- ☆19 · Updated 8 months ago
- ☆30 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆99 · Updated last month
- A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆61 · Updated last month
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆79 · Updated 8 months ago
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆61 · Updated 5 months ago
- 🚀 SWE-bench Goes Live! ☆80 · Updated this week
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents ☆91 · Updated 4 months ago
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability. ☆15 · Updated 5 months ago
- ☆40 · Updated 2 weeks ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆44 · Updated 2 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆94 · Updated last year
- Training and Benchmarking LLMs for Code Preference. ☆33 · Updated 7 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆82 · Updated last month
- ☆26 · Updated last year
- ☆20 · Updated last month
- [ICML 2025] Weak-to-Strong Jailbreaking on Large Language Models ☆76 · Updated last month
- ☆41 · Updated 8 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆27 · Updated 4 months ago
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | ACL 2024 SRW (Oral) ☆63 · Updated 8 months ago
- ☆37 · Updated last year
- ☆26 · Updated this week
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆50 · Updated 2 months ago
- This repository contains the official code for the paper: "Prompt Injection: Parameterization of Fixed Inputs" ☆32 · Updated 9 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 · Updated 7 months ago
- Restore safety in fine-tuned language models through task arithmetic ☆28 · Updated last year
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆19 · Updated 8 months ago
- Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025) ☆55 · Updated 3 months ago
- "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents" ☆77 · Updated 2 months ago