[NeurIPS 2025] Official repository of RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
☆118Dec 2, 2025Updated 4 months ago
Alternatives and similar repositories for RiOSWorld
Users who are interested in RiOSWorld are comparing it to the libraries listed below.
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…☆27Mar 9, 2026Updated last month
- Codes for paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents"☆68Feb 25, 2025Updated last year
- [ACL 2025] Data and Code for Paper VLSBench: Unveiling Visual Leakage in Multimodal Safety☆59Jul 21, 2025Updated 8 months ago
- ☆128Feb 3, 2025Updated last year
- Official repository of DARE: Diffusion Large Language Models Alignment and Reinforcement Executor☆180Apr 9, 2026Updated last week
- Diagnostic Framework for LLMs and MLLMs☆36Mar 2, 2026Updated last month
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…☆73Feb 9, 2026Updated 2 months ago
- [EMNLP 2025] The code repo of paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Com…☆40Nov 24, 2025Updated 4 months ago
- ccap `(C)amera(CAP)ture` is a simple and easy-to-use C/C++ camera capture library designed to provide you with simple and efficient camer…☆131Mar 29, 2026Updated 2 weeks ago
- The official implementation of dLLM-Factory☆25Jul 12, 2025Updated 9 months ago
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…☆51Mar 25, 2026Updated 3 weeks ago
- [ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"☆10Jul 1, 2024Updated last year
- ☆13Feb 21, 2025Updated last year
- [AAAI 2026] Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks☆43Nov 24, 2025Updated 4 months ago
- A virtual clinical environment for self‑evolving LLM diagnostic agents.☆100Feb 12, 2026Updated 2 months ago
- ☆221Oct 12, 2025Updated 6 months ago
- Team Doggeee's solution to the Ego4D LTA challenge @ CVPRW'23☆14Nov 4, 2023Updated 2 years ago
- [ICLR 2025] This repo is the official implementation of "The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs".☆13Jan 25, 2025Updated last year
- ☆124Feb 6, 2026Updated 2 months ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆13Jul 27, 2025Updated 8 months ago
- This is the official implementation of the method presented in the paper "Uncertainty-Aware Test-Time Optimization for 3D Human Pose Esti…☆36Updated this week
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- Repo for paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability"☆76Updated this week
- The code implementation of GraCeFul (Accepted in COLING 2025)☆13Jan 27, 2025Updated last year
- [Preprint 2025] Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation☆37Oct 16, 2025Updated 6 months ago
- Llemma formal2formal (tactic prediction) theorem proving experiments☆20Oct 17, 2023Updated 2 years ago
- WraAct is a tool to construct the convex hull of various activation functions.☆33Feb 13, 2026Updated 2 months ago
- Official repository for paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main)☆23Jul 9, 2024Updated last year
- Implementation of the paper "Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing".☆10Feb 6, 2024Updated 2 years ago
- 《MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation》☆142Feb 2, 2026Updated 2 months ago
- ☆16Sep 17, 2024Updated last year
- Secure Inference Resilient Against Malicious Clients☆14May 3, 2022Updated 3 years ago
- ☆30Jan 28, 2026Updated 2 months ago
- Self-healing infrastructure for AI agent payments. 90.3% auto-recovery.☆232Updated this week
- [ACL 2025] Research code for the paper "OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents"☆20Jun 19, 2025Updated 9 months ago
- Official code repo for the paper "MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments"☆35Mar 9, 2026Updated last month
- The codes for the paper One-bit Deep Hashing: Towards a Resource-Efficient Hashing Model with Binary Neural Networks (ACMMM24)☆45Mar 4, 2025Updated last year
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆78Dec 8, 2025Updated 4 months ago
- ☆10Mar 8, 2025Updated last year