showlab / computer_use_ootb
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
☆1,489Updated 2 weeks ago
Alternatives and similar repositories for computer_use_ootb:
Users who are interested in computer_use_ootb are comparing it to the libraries listed below.
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.☆1,158Updated 3 weeks ago
- AI computer use powered by open source LLMs and E2B Desktop Sandbox☆1,007Updated 3 weeks ago
- Agent S: an open agentic framework that uses computers like a human☆1,526Updated this week
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.☆651Updated 3 weeks ago
- Learn how to use CUA (our Computer Using Agent) via the API on multiple computer environments.☆703Updated this week
- An open-source end-to-end VLM-based GUI Agent☆874Updated this week
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects.☆317Updated last week
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan…☆1,033Updated 10 months ago
- [ICLR 2025] Automated Design of Agentic Systems☆1,244Updated 2 months ago
- ☆538Updated this week
- A live stream development of RL tuning for LLM agents☆2,295Updated this week
- ☆810Updated 2 weeks ago
- agent q - oss advanced reasoning and learning for autonomous ai agents☆412Updated 6 months ago
- ☆2,754Updated 2 weeks ago
- ☆3,819Updated last month
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction☆273Updated last month
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents☆1,559Updated this week
- 🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/☆332Updated 4 months ago
- A Model Context Protocol server for searching and analyzing arXiv papers☆773Updated this week
- Dive is an open-source MCP Host Desktop Application that seamlessly integrates with any LLMs supporting function calling capabilities. ✨☆790Updated this week
- ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)☆428Updated 4 months ago
- Desktop app powered by Claude’s computer use capability to control your computer☆423Updated 2 months ago
- Profile-Based Long-Term Memory for AI Applications☆1,011Updated this week
- Open source alternative to Gemini Deep Research. Generate reports with AI based on search results.☆1,759Updated 3 weeks ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api☆1,086Updated 2 months ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents☆315Updated last month
- 🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents. https://oasis.camel-ai.org☆1,252Updated this week
- A mini, open-weights, version of our Proxy assistant.☆869Updated last month
- Keep searching, reading webpages, reasoning until it finds the answer (or exceeds the token budget)☆3,860Updated last week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.☆1,333Updated this week