microsoft / WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
☆483 · Updated this week
Related projects
Alternatives and complementary repositories for WindowsAgentArena
- Agent S: an open agentic framework that uses computers like a human ☆606 · Updated this week
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆496 · Updated 5 months ago
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆328 · Updated 5 months ago
- ☆407 · Updated last month
- AG2 (formerly AutoGen): The Open-Source AgentOS. Join the community at: https://discord.gg/pAbnFJrkgZ ☆527 · Updated this week
- AWM: Agent Workflow Memory ☆205 · Updated last month
- ☆316 · Updated last month
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆191 · Updated last week
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆647 · Updated last week
- Automated Design of Agentic Systems ☆1,038 · Updated this week
- An out-of-the-box (OOTB) version of Anthropic Claude Computer Use for Windows and macOS ☆345 · Updated this week
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ☆342 · Updated 8 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆350 · Updated last month
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆166 · Updated this week
- ☆294 · Updated 5 months ago
- Agentless🐱: an agentless approach to automatically solve software development problems ☆723 · Updated last week
- AI for all: Build the large graph of the language models ☆244 · Updated 5 months ago
- OpenResearcher, an advanced Scientific Research Assistant ☆408 · Updated last month
- [EMNLP 2024 Demo] TinyAgent: Function Calling at the Edge! ☆313 · Updated 2 months ago
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ☆589 · Updated last month
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆431 · Updated last week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆204 · Updated this week
- Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding ☆327 · Updated 9 months ago
- Agent driven automation starting with the web. Discord: https://discord.gg/wgNfmFuqJF ☆818 · Updated this week
- Code and Data for Tau-Bench ☆201 · Updated 3 weeks ago
- Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine ☆356 · Updated last month
- A compilation of the best multi-agent papers ☆258 · Updated 2 weeks ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆517 · Updated 2 weeks ago
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆515 · Updated last week
- multi1: create o1-like reasoning chains with multiple AI providers (and locally). Supports LiteLLM as backend too for 100+ providers at o… ☆314 · Updated last month