huggingface/screensuite

ScreenSuite - The most comprehensive benchmarking suite for GUI Agents!

☆144

Alternatives and similar repositories for screensuite

Users who are interested in screensuite compare it to the repositories listed below.

agentsea / osuniverse
Benchmark of complex, multimodal desktop-oriented tasks for advanced GUI-navigation AI agents
☆24 · May 7, 2025 · Updated last year
showlab / FocusUI
[CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
☆35 · Jun 7, 2026 · Updated last month
microsoft / GUI-Actor
[NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
☆410 · Apr 13, 2026 · Updated 3 months ago
showlab / WorldGUI
Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.
☆124 · Jul 27, 2025 · Updated 11 months ago
HKUNLP / ProGen
[EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.
☆27 · Feb 4, 2023 · Updated 3 years ago
xhan77 / in-context-alignment
In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning
☆34 · Aug 9, 2023 · Updated 2 years ago
showlab / ShowUI
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
☆1,883 · Apr 24, 2026 · Updated 2 months ago
OSU-NLP-Group / UGround
[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents
☆315 · Mar 11, 2026 · Updated 4 months ago
huggingface / smol2operator
☆136 · Sep 23, 2025 · Updated 9 months ago
niuzaisheng / ScreenExplorer
ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World
☆26 · Jun 17, 2025 · Updated last year
video-reality-test / video-reality-test
☆23 · May 5, 2026 · Updated 2 months ago
microsoft / FIVE-UI-Evol
☆31 · Apr 15, 2026 · Updated 3 months ago
OpenGVLab / ZeroGUI
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
☆119 · Jul 17, 2025 · Updated last year
showlab / Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
☆1,197 · Aug 17, 2025 · Updated 11 months ago
bin123apple / InfantAgent
[NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.
☆39 · Apr 23, 2026 · Updated 2 months ago
OSU-NLP-Group / RedTeamCUA
[ICLR'26 Oral] RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
☆57 · Feb 9, 2026 · Updated 5 months ago
Khang-9966 / Computer-Browser-Phone-Use-Agent-Datasets
This repository hosts a collection of datasets for training and evaluating CUA / GUI agents.
☆136 · Jun 16, 2026 · Updated last month
LZhengisme / self-infilling
[ICML 2024] Self-Infilling Code Generation
☆18 · May 5, 2024 · Updated 2 years ago
huggingface / screenenv
A powerful Python library for creating and managing isolated desktop environments using Docker containers.
☆453 · May 26, 2026 · Updated last month
GAIR-NLP / benbench
Benchmarking Benchmark Leakage in Large Language Models
☆61 · May 20, 2024 · Updated 2 years ago
OSU-NLP-Group / Mind2Web-2
[NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge
☆111 · May 17, 2026 · Updated 2 months ago
showlab / showui-pi
[CVPR 2026] ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
☆128 · Apr 22, 2026 · Updated 2 months ago
OSU-NLP-Group / Middleware
Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)
☆37 · Dec 29, 2024 · Updated last year
web-arena-x / webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
☆1,550 · Nov 26, 2025 · Updated 7 months ago
xlang-ai / OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
☆3,028 · Updated this week
JIA-Lab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆162 · May 29, 2025 · Updated last year
ritzz-ai / GUI-R1
Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
☆252 · May 5, 2025 · Updated last year
hcompai / surfer-h-cli
Run Surfer-H agents powered by Holo1 using the Surfer-H-CLI. Includes example tasks, scripts, and configurations.
☆164 · Jun 25, 2026 · Updated 3 weeks ago
ariG23498 / fine-tune-paligemma
Notebooks for fine-tuning PaliGemma
☆117 · Apr 15, 2025 · Updated last year
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
SumilerGAO / SunGen
☆28 · Feb 26, 2023 · Updated 3 years ago
nbroad1881 / strideformer
Using short models to classify long texts
☆21 · Mar 8, 2023 · Updated 3 years ago
OSU-NLP-Group / Mind2Web
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w…
☆1,015 · Nov 5, 2025 · Updated 8 months ago
web-arena-x / visualwebarena
VisualWebArena is a benchmark for multimodal agents.
☆483 · Nov 9, 2024 · Updated last year
Timothyxxx / KVCachePapers
☆20 · May 24, 2024 · Updated 2 years ago
OSU-NLP-Group / Online-Mind2Web
An Illusion of Progress? Assessing the Current State of Web Agents
☆191 · Jun 25, 2026 · Updated 3 weeks ago
Essential-AI / eai-taxonomy
☆59 · Aug 19, 2025 · Updated 11 months ago
huggingface / feel
☆15 · May 26, 2026 · Updated last month
chuyg1005 / seeclick-crawler
☆20 · Apr 24, 2024 · Updated 2 years ago