[ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents
☆58Feb 26, 2026Updated last month
Alternatives and similar repositories for computer-agent-arena
Users that are interested in computer-agent-arena are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Extending context length of visual language models☆12Dec 18, 2024Updated last year
- Code for paper ”Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability“☆15Jun 13, 2023Updated 2 years ago
- [EMNLP'23] Code for Generating Data for Symbolic Language with Large Language Models☆18Oct 21, 2023Updated 2 years ago
- 🍨 Gelato — From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents☆40Dec 22, 2025Updated 3 months ago
- ☆21May 24, 2024Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- [NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis☆157Nov 6, 2025Updated 4 months ago
- ☆27Feb 26, 2023Updated 3 years ago
- Code and data for "ImgTrojan: Jailbreaking Vision-Language Models with ONE Image"☆24Mar 26, 2025Updated last year
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments☆2,689Mar 17, 2026Updated last week
- Using multiple LLMs for ensemble Forecasting☆16Jan 17, 2024Updated 2 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- A Survey on Interpretable Cross-modal Reasoning☆15Oct 12, 2023Updated 2 years ago
- Self-hosted GPT-4V api☆27Nov 6, 2023Updated 2 years ago
- ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World☆25Jun 17, 2025Updated 9 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects.☆525Nov 7, 2025Updated 4 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction☆383Mar 7, 2025Updated last year
- ☆27Jul 23, 2025Updated 8 months ago
- ☆25Aug 23, 2024Updated last year
- Lucene open-domain QA retrieval in python☆11Feb 18, 2021Updated 5 years ago
- ☆27Mar 27, 2025Updated last year
- Paper collections of methods that using language to interact with environment, including interact with real world, simulated world or WWW…☆129Jul 26, 2023Updated 2 years ago
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model☆22Oct 25, 2023Updated 2 years ago
- ☆17Feb 20, 2023Updated 3 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- [ICLR2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials☆53Feb 21, 2025Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆37Dec 29, 2024Updated last year
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023)☆24Oct 10, 2023Updated 2 years ago
- [Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models☆19Mar 16, 2023Updated 3 years ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"☆87Feb 14, 2025Updated last year
- OpenReivew Submission Visualization (ICLR 2024/2025)☆154Oct 17, 2024Updated last year
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"☆37Mar 3, 2025Updated last year
- Can Language Models Solve Olympiad Programming?☆123Jan 14, 2025Updated last year
- GUICourse: From General Vision Langauge Models to Versatile GUI Agents☆137Mar 1, 2026Updated 3 weeks ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …☆13Jul 13, 2022Updated 3 years ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆49Dec 22, 2023Updated 2 years ago
- ☆21May 5, 2020Updated 5 years ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult…☆839Feb 3, 2025Updated last year
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language☆43Feb 24, 2023Updated 3 years ago
- [ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"☆79Nov 25, 2024Updated last year
- Exploring Few-Shot Adaptation of Language Models with Tables☆24Aug 22, 2022Updated 3 years ago