The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format and describe the UI elements present on the screen: their type, location, OCR text, and a short description. It was introduced in the paper `ScreenAI: A Vision-Language Model for UI and Infographics Understanding`.
☆84Mar 7, 2024Updated last year
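As a rough illustration of the four pieces of information the description mentions (type, location, OCR text, short description), here is a minimal Python sketch of one annotated element. The class name, field names, bounding-box convention, and the flattened text layout are all assumptions made for illustration; they are not taken from the dataset's actual annotation format.

```python
from dataclasses import dataclass


@dataclass
class UIElement:
    """One annotated UI element (hypothetical schema, not the dataset's real format)."""
    element_type: str                 # e.g. "BUTTON", "TEXT", "IMAGE" (example labels, assumed)
    bbox: tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max); convention assumed
    ocr_text: str = ""                # text read from the element, if any
    description: str = ""             # short free-text description, if any


def to_text(element: UIElement) -> str:
    """Flatten an element into a single text line (layout is an assumption)."""
    parts = [element.element_type]
    if element.ocr_text:
        parts.append(element.ocr_text)
    if element.description:
        parts.append(f"({element.description})")
    parts.extend(str(value) for value in element.bbox)
    return " ".join(parts)


button = UIElement("BUTTON", (10, 20, 110, 60), ocr_text="Sign in")
print(to_text(button))
```

The real annotation strings should be checked against the dataset files before relying on any particular layout.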
Alternatives and similar repositories for screen_annotation
Users who are interested in screen_annotation are comparing it to the libraries listed below.
- The ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K …☆139Feb 7, 2025Updated last year
- The dataset includes screen summaries that describe the functionalities of Android app screenshots. It is used for training and evaluation of …☆63Jul 27, 2021Updated 4 years ago
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b…☆34Jun 27, 2024Updated last year
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding"☆378Feb 6, 2026Updated 3 weeks ago
- The model, data and code for the visual GUI Agent SeeClick☆467Jul 13, 2025Updated 7 months ago
- A pre-labelled dataset for UI element / layout detection☆67Jun 15, 2023Updated 2 years ago
- GUICourse: From General Vision Language Models to Versatile GUI Agents☆136Updated this week
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments☆61Aug 19, 2024Updated last year
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item …☆47Aug 2, 2021Updated 4 years ago
- A modern Python library to work with Anoto dot patterns.☆16Aug 24, 2023Updated 2 years ago
- Unofficial PyTorch Implementation of Improving Inference for Neural Image Compression☆15Apr 27, 2025Updated 10 months ago
- [EMNLP 2022] The baseline code for META-GUI dataset☆14Jul 9, 2024Updated last year
- Custom object detection for UI of the design system using TensorFlow☆16Jun 20, 2023Updated 2 years ago
- The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluation of the widget ca…☆23Jun 24, 2021Updated 4 years ago
- [NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents☆53Nov 27, 2025Updated 3 months ago
- Official Implementation of CL-ALFRED (ICLR'24)☆30Oct 24, 2024Updated last year
- Official code accompanying the arXiv paper Compressing Multisets with Large Alphabets☆30Sep 22, 2021Updated 4 years ago
- A dataset for Audio-Visual Sound Event Detection in Movies☆26Jan 23, 2023Updated 3 years ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents☆300Jul 18, 2025Updated 7 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost☆110Jul 17, 2025Updated 7 months ago
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025)☆35Jul 21, 2025Updated 7 months ago
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou…☆34Aug 20, 2020Updated 5 years ago
- Source code for the article "Data Driven and Discriminative Projections for Large-Scale Cover Song Identification"☆38Apr 27, 2015Updated 10 years ago
- 📦 A collection of pastable code gathered from past projects☆12Sep 9, 2024Updated last year
- Scrapes high-res fashion images from Vogue Runway☆16Jan 22, 2026Updated last month
- DroidAgent: Intent-Driven Mobile GUI Testing with Autonomous LLM Agents☆58Mar 12, 2024Updated last year
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆42Jan 15, 2024Updated 2 years ago
- Code for "Zero-Shot Out-of-Distribution Detection with Feature Correlations"☆13Jan 19, 2020Updated 6 years ago
- Continuous quality evaluation of ML algorithms via CI/CD and GitHub Actions.