computer-agents / agent-studio
Environments, tools, and benchmarks for general computer agents
★172 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for agent-studio
- Agent-as-a-Judge and DevAI dataset ★184 · Updated last week
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ★133 · Updated this week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ★166 · Updated this week
- AWM: Agent Workflow Memory ★203 · Updated last month
- Official Repo for UGround ★93 · Updated this week
- ★102 · Updated 2 months ago
- ★283 · Updated last month
- ★116 · Updated 5 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ★186 · Updated 3 months ago
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen… ★199 · Updated 3 months ago
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ★187 · Updated this week
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ★328 · Updated 4 months ago
- An Analytical Evaluation Board of Multi-turn LLM Agents ★245 · Updated 5 months ago
- An implementation of Everything of Thoughts (XoT). ★129 · Updated 8 months ago
- ★152 · Updated 2 months ago
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning ★177 · Updated last month
- AI for all: Build the large graph of the language models ★238 · Updated 5 months ago
- The model, data and code for the visual GUI Agent SeeClick ★216 · Updated 2 months ago
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks ★277 · Updated 3 weeks ago
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ★346 · Updated 2 months ago
- VisualWebArena is a benchmark for multimodal agents. ★236 · Updated this week
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038) ★169 · Updated last month
- Beating the GAIA benchmark with Transformers Agents. ★62 · Updated 2 weeks ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★115 · Updated last month
- FireAct: Toward Language Agent Fine-tuning ★254 · Updated last year
- This is a collection of resources for computer-use agents, including videos, blogs, papers, and projects. ★85 · Updated this week
- ★35 · Updated last year
- ★311 · Updated last month
- AndroidWorld is an environment and benchmark for autonomous agents ★125 · Updated this week
- ★72 · Updated 10 months ago