xlang-ai / OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
☆1,725 · Updated 3 weeks ago
Alternatives and similar repositories for OSWorld:
Users who are interested in OSWorld are comparing it to the repositories listed below.
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,393 · Updated 3 months ago
- Agentless🐱: an agentless approach to automatically solve software development problems ☆1,600 · Updated 3 months ago
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,637 · Updated 6 months ago
- [ICLR 2025] Agent S: an open agentic framework that uses computers like a human ☆1,407 · Updated last week
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ☆711 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆735 · Updated 2 months ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,645 · Updated 8 months ago
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆937 · Updated last month
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆640 · Updated 3 weeks ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,073 · Updated 2 months ago
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist… ☆1,637 · Updated 10 months ago
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆956 · Updated 10 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,241 · Updated 2 months ago
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be… ☆2,906 · Updated last week
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" ☆808 · Updated last week
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 … ☆839 · Updated 8 months ago
- ☆2,485 · Updated 2 weeks ago
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,033 · Updated 3 months ago
- 🌎💪 BrowserGym, a Gym environment for web task automation ☆650 · Updated 2 weeks ago
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆1,222 · Updated 2 weeks ago
- Together Mixture-Of-Agents (MoA) - 65.1% on AlpacaEval with OSS models ☆2,717 · Updated 2 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆408 · Updated 6 months ago
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ☆1,295 · Updated last month
- ☆846 · Updated 6 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,477 · Updated 3 weeks ago
- [ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models" ☆738 · Updated 8 months ago
- ☆927 · Updated 11 months ago
- Code for Quiet-STaR ☆728 · Updated 7 months ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆311 · Updated last month
- A framework for Claude Opus to intelligently orchestrate subagents. ☆4,225 · Updated 9 months ago