[NeurIPS 2025] Official repository of RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
☆117Dec 2, 2025Updated 3 months ago
Alternatives and similar repositories for RiOSWorld
Users that are interested in RiOSWorld are comparing it to the libraries listed below.
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…☆27Mar 9, 2026Updated 2 weeks ago
- ☆20Jun 16, 2025Updated 9 months ago
- A simple anime statistics tracker that helps you explore seasonal anime from 2006 onwards with a lot of interesting data and visualizations.☆42Mar 15, 2026Updated last week
- Codes for paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents"☆65Feb 25, 2025Updated last year
- ☆125Feb 3, 2025Updated last year
- Official repository of DARE: dLLM Alignment and Reinforcement Executor☆166Mar 17, 2026Updated last week
- Diagnostic Framework for LLMs and MLLMs☆35Mar 2, 2026Updated 3 weeks ago
- Code for ICCV 2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves"☆17Jul 11, 2025Updated 8 months ago
- [ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…☆73Feb 9, 2026Updated last month
- [EMNLP 2025] The code repo of paper "X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Com…☆40Nov 24, 2025Updated 4 months ago
- The official implementation of dLLM-Factory☆26Jul 12, 2025Updated 8 months ago
- [ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"☆10Jul 1, 2024Updated last year
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…☆46Updated this week
- ☆13Feb 21, 2025Updated last year
- Let AI completely take over your blog☆30Nov 2, 2025Updated 4 months ago
- [AAAI 2026] Data and Code for Paper IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks☆41Nov 24, 2025Updated 4 months ago
- This is a simple implementation of a crypto research agent inspired by the Claude Skills repo☆105Jan 28, 2026Updated last month
- ☆221Oct 12, 2025Updated 5 months ago
- [arXiv 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"☆17Apr 3, 2025Updated 11 months ago
- Control your Mac with natural language by converting intent into executable action sequences, with planning, retries, and verifiable outc…☆34Feb 8, 2026Updated last month
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆13Jul 27, 2025Updated 8 months ago
- This is the official implementation of the method presented in the paper "Uncertainty-Aware Test-Time Optimization for 3D Human Pose Esti…☆36Sep 22, 2025Updated 6 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆24Nov 29, 2024Updated last year
- ☆92Jan 28, 2026Updated last month
- ☆27Jan 28, 2026Updated last month
- Llemma formal2formal (tactic prediction) theorem proving experiments☆20Oct 17, 2023Updated 2 years ago
- WraAct is a tool to construct the convex hull of various activation functions.☆33Feb 13, 2026Updated last month
- Official repository for the paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main)☆23Jul 9, 2024Updated last year
- ☆106Feb 4, 2024Updated 2 years ago
- "MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation"☆136Feb 2, 2026Updated last month
- ☆73Feb 4, 2026Updated last month
- Shadow Attack, LiRA, Quantile Regression and RMIA implementations in PyTorch (Online version)☆14Nov 8, 2024Updated last year
- A collection of selected graduate coursework from Shanghai Jiao Tong University, Spring 2020☆16Jun 14, 2020Updated 5 years ago
- [ACL 2025] Research code for the paper "OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents"☆19Jun 19, 2025Updated 9 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆349Jan 22, 2026Updated 2 months ago
- A self-made NeurIPS poster template, infused with the unique design style of ShanghaiTech.☆15Dec 26, 2023Updated 2 years ago
- ☆1,552Sep 18, 2025Updated 6 months ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated last year
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆75Dec 8, 2025Updated 3 months ago