scaleapi / browser-art
☆34 Updated 7 months ago
Alternatives and similar repositories for browser-art
Users that are interested in browser-art are comparing it to the libraries listed below
- [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use ☆166 Updated last year
- [ICLR 2025] Dissecting adversarial robustness of multimodal language model agents ☆110 Updated 8 months ago
- Improving Alignment and Robustness with Circuit Breakers ☆238 Updated last year
- VisualWebArena is a benchmark for multimodal agents. ☆392 Updated 11 months ago
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m… ☆146 Updated 4 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents ☆217 Updated last year
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆119 Updated last year
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,… ☆52 Updated 10 months ago
- ☆17 Updated 4 months ago
- OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [NeurIPS 2025 Spotlight] ☆38 Updated last month
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆434 Updated this week
- ☆185 Updated last year
- ☆35 Updated last year
- Code to break Llama Guard ☆32 Updated last year
- ☆138 Updated 3 months ago
- ☆119 Updated 5 months ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents" ☆88 Updated 3 weeks ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs ☆92 Updated 10 months ago
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) ☆168 Updated last year
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding ☆146 Updated last year
- Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizX… ☆83 Updated last year
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆214 Updated 2 weeks ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆201 Updated last year
- Collection of evals for Inspect AI ☆264 Updated this week
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs" ☆62 Updated 4 months ago
- An Illusion of Progress? Assessing the Current State of Web Agents ☆100 Updated 3 months ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆115 Updated last year
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction". ☆290 Updated 4 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆72 Updated 3 months ago
- [NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning" ☆160 Updated 6 months ago