This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that ground natural language to UI actions, as described in the paper "Mapping Natural Language Instructions to Mobile UI Action Sequences" by Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, and Jason Baldridge, which is acc…
☆34Aug 20, 2020Updated 5 years ago
Alternatives and similar repositories for seq2act
Users that are interested in seq2act are comparing it to the libraries listed below
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describe the semantic type of a UI object on Android ap…☆53Jan 14, 2022Updated 4 years ago
- A Universal Platform for Training and Evaluation of Mobile Interaction☆60Sep 24, 2025Updated 5 months ago
- [ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 e…☆147Jan 3, 2026Updated last month
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item …☆47Aug 2, 2021Updated 4 years ago
- Under construction☆13Jan 15, 2025Updated last year
- RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs☆19Feb 8, 2026Updated 3 weeks ago
- Graph Convolutional Network on data from Elliptic bitcoin dataset of transactions graph☆16Oct 29, 2019Updated 6 years ago
- ☆23Aug 29, 2023Updated 2 years ago
- CGAT: Channel-aware Graph Attention Networks☆20Mar 24, 2023Updated 2 years ago
- Recognize graphical user interface layout by grouping GUI elements according to their visual attributes☆49Jun 17, 2022Updated 3 years ago
- AndroidWorld is an environment and benchmark for autonomous agents☆635Feb 20, 2026Updated last week
- The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluation of the widget ca…☆23Jun 24, 2021Updated 4 years ago
- ☆20Apr 24, 2024Updated last year
- UICrit is a dataset containing human-generated natural language design critiques, corresponding bounding boxes for each critique, and des…☆26Nov 19, 2024Updated last year
- ☆10Nov 22, 2022Updated 3 years ago
- Camera-based Document Analysis☆26Jul 7, 2025Updated 7 months ago
- ☆36Oct 7, 2023Updated 2 years ago
- A dataset consisting of 502 English dialogs with 12,000 annotated utterances between a user and an assistant discussing movie preferences…☆28Jan 20, 2021Updated 5 years ago
- ☆31Sep 27, 2024Updated last year
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025)☆35Jul 21, 2025Updated 7 months ago
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination?☆128Feb 20, 2024Updated 2 years ago
- The model, data and code for the visual GUI Agent SeeClick☆467Jul 13, 2025Updated 7 months ago
- LunarVR is a virtual reality application made for the NASA Space Apps Challenge 2018. This project was awarded Global Winner in Best use …☆12Feb 7, 2023Updated 3 years ago
- DroidAgent: Intent-Driven Mobile GUI Testing with Autonomous LLM Agents☆58Mar 12, 2024Updated last year
- [SCIS] MULTI-Benchmark: Multimodal Understanding Leaderboard with Text and Images☆44Nov 19, 2025Updated 3 months ago
- ☆57Aug 10, 2025Updated 6 months ago
- A simple Streamlit frontend for a pre-trained MobileNet CNN model + OpenCV for face mask detection in images.☆10Mar 25, 2023Updated 2 years ago
- Generate training datasets for text detection☆12Jul 1, 2020Updated 5 years ago
- A template for a Djinni library that can be used in Java/Kotlin, ObjC/Swift and C#☆11Oct 6, 2022Updated 3 years ago
- Continuous quality evaluation of ML algorithms via CI/CD and GitHub Actions.☆16Jan 15, 2020Updated 6 years ago
- A large-scale training and benchmarking framework for rPPG.☆10Nov 26, 2024Updated last year
- MiniWoB++: a web interaction benchmark for reinforcement learning☆371May 5, 2025Updated 9 months ago
- ☆35Mar 24, 2023Updated 2 years ago
- A fast pure-Python search engine☆12Apr 9, 2009Updated 16 years ago
- Python code to break SVG files into polygon objects consumable in Tableau.☆10Mar 16, 2020Updated 5 years ago
- The official implementation of InterBERT☆11Oct 18, 2022Updated 3 years ago
- Owl Eyes: Spotting UI Display Issues via Visual Understanding☆11Jul 31, 2020Updated 5 years ago
- An environment for mobile agents to interact with a realistic Android device or Android emulator☆13Jul 19, 2024Updated last year
- Home page for Microsoft Phi-Ground tech-report☆23Sep 8, 2025Updated 5 months ago