OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [NeurIPS 2025 Spotlight]
☆49Sep 18, 2025Updated 5 months ago
Alternatives and similar repositories for os-harm
Users that are interested in os-harm are comparing it to the libraries listed below
Sorting:
- ☆13Jun 23, 2022Updated 3 years ago
- Code for the API, workload execution, and agents underlying the LLMail-Inject Adpative Prompt Injection Challenge☆19Updated this week
- Provable Worst Case Guarantees for the Detection of Out-of-Distribution Data☆13Sep 20, 2022Updated 3 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens☆17Sep 8, 2025Updated 5 months ago
- On the effectiveness of adversarial training against common corruptions [UAI 2022]☆30May 16, 2022Updated 3 years ago
- ☆15Dec 7, 2021Updated 4 years ago
- [WACV 2026] MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval☆13Sep 18, 2025Updated 5 months ago
- ☆18May 23, 2025Updated 9 months ago
- A School for All Seasons on Trustworthy Machine Learning☆12Jun 30, 2021Updated 4 years ago
- Source code for "Neural Anisotropy Directions"☆16Nov 17, 2020Updated 5 years ago
- Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024]☆21Apr 15, 2024Updated last year
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024]☆21May 2, 2024Updated last year
- A modern look at the relationship between sharpness and generalization [ICML 2023]☆43Sep 11, 2023Updated 2 years ago
- [ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments☆34Feb 9, 2026Updated 2 weeks ago
- [ECCV 2024] Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models☆21Jul 17, 2024Updated last year
- Chain-of-Frames [CVPR 2026]☆38Jul 2, 2025Updated 8 months ago
- Source code of "Hold me tight! Influence of discriminative features on deep network boundaries"☆21Dec 10, 2021Updated 4 years ago
- ☆30Jun 19, 2023Updated 2 years ago
- Ludic – an LLM-RL library for the era of experience☆60Jan 9, 2026Updated last month
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling"☆25Mar 28, 2024Updated last year
- SGD with large step sizes learns sparse features [ICML 2023]☆33Apr 24, 2023Updated 2 years ago
- [ICML'20] Multi Steepest Descent (MSD) for robustness against the union of multiple perturbation models.☆25Jul 25, 2024Updated last year
- Code and data for the ICLR 2021 paper "Perceptual Adversarial Robustness: Defense Against Unseen Threat Models".☆56Jan 18, 2022Updated 4 years ago
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023]☆29Sep 22, 2023Updated 2 years ago
- Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (M…☆31Nov 20, 2024Updated last year
- ☆46May 8, 2024Updated last year
- 하드코딩으로 아주아주 간단한 챗봇☆10May 25, 2018Updated 7 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- ☆34Nov 7, 2022Updated 3 years ago
- ☆13Aug 28, 2024Updated last year
- Pseudo-Intel-CET functionality plugin based on QEMU 8.2.2 plugin system, with minor modifications to QEMU TCG body code to adapt to Glibc…☆14Jun 5, 2024Updated last year
- The artifact for NDSS '25 paper "ASGARD: Protecting On-Device Deep Neural Networks with Virtualization-Based Trusted Execution Environmen…☆14Oct 16, 2025Updated 4 months ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆21Jan 6, 2026Updated last month
- ☆17Dec 14, 2025Updated 2 months ago
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups☆51Dec 23, 2024Updated last year
- Time control for simulations☆11Jan 18, 2023Updated 3 years ago
- ☆49Jun 19, 2024Updated last year
- [NeurIPS 2023] Code for the paper "Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threa…☆39Dec 3, 2024Updated last year