xlang-ai / OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
☆1,326 Updated this week
Related projects
Alternatives and complementary repositories for OSWorld
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,505 Updated 2 months ago
- Llama-3 agents that can browse the web by following instructions and talking to you ☆1,349 Updated 3 months ago
- Automated Design of Agentic Systems ☆1,016 Updated this week
- [ICLR 2024] SWE-bench: Can Language Models Resolve Real-world GitHub Issues? ☆1,943 Updated last week
- OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophist… ☆1,591 Updated 6 months ago
- Agent S: an open agentic framework that uses computers like a human ☆556 Updated this week
- Agentless🐱: an agentless approach to automatically solve software development problems ☆710 Updated last week
- The first open source Large Action Model generalist Artificial Narrow Intelligence agentic framework that controls completely human user … ☆1,263 Updated 4 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆637 Updated 2 weeks ago
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆971 Updated this week
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% tasks (pass@1) in SWE-b… ☆2,711 Updated last week
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆481 Updated 5 months ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling ☆1,504 Updated 3 months ago
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆460 Updated this week
- Optimizing inference proxy for LLMs ☆1,329 Updated this week
- Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models ☆2,593 Updated 3 weeks ago
- We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 … ☆810 Updated 4 months ago
- The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Join our Community: https://discord.com/servers/agora-999382051… ☆1,720 Updated this week
- Python & JS/TS SDK for running AI-generated code/code interpreting in your AI app ☆1,247 Updated last week
- Common interface for interacting with AI agents. The protocol is tech stack agnostic - you can use it with any framework for building age… ☆990 Updated 4 months ago
- Agent driven automation starting with the web. Discord: https://discord.gg/wgNfmFuqJF ☆799 Updated this week
- Code for Quiet-STaR ☆639 Updated 2 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆960 Updated 3 months ago
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs ☆1,460 Updated last week
- [ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models" ☆678 Updated 3 months ago
- ☆1,878 Updated last week
- AIDE: the state-of-the-art machine learning engineer agent, generating machine learning solution code from natural language descriptions. ☆558 Updated this week
- Codebase for Aria - an Open Multimodal Native MoE ☆779 Updated this week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆3,215 Updated 2 months ago
- AIOS: LLM Agent Operating System ☆3,390 Updated this week