The dataset includes screen summaries that describe the functionality of Android app screenshots. It is used for training and evaluating the screen2words models (our paper, accepted by UIST'21, will be linked soon).
☆63 Jul 27, 2021 Updated 4 years ago
Alternatives and similar repositories for screen2words
Users who are interested in screen2words are comparing it to the repositories listed below
- The dataset includes widget captions that describe the functionality of UI elements. It is used for training and evaluation of the widget ca… ☆23 Jun 24, 2021 Updated 4 years ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆47 Aug 2, 2021 Updated 4 years ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆84 Mar 7, 2024 Updated last year
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou… ☆34 Aug 20, 2020 Updated 5 years ago
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describe the semantic type of a UI object on Android ap… ☆53 Jan 14, 2022 Updated 4 years ago
- Screen2Vec is a new self-supervised technique for generating more comprehensive semantic embeddings of GUI screens and components using t… ☆81 Feb 3, 2025 Updated last year
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments ☆61 Aug 19, 2024 Updated last year
- A mobile GUI search engine using a vision-language model ☆14 May 5, 2025 Updated 9 months ago
- VINS: Visual Search for Mobile User Interface Design ☆49 Jan 9, 2021 Updated 5 years ago
- ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K … ☆139 Feb 7, 2025 Updated last year
- An accurate GUI element detection approach based on old-fashioned CV algorithms [Upgraded on 5/July/2021] ☆525 Nov 8, 2023 Updated 2 years ago
- [WWW2024 Oral] Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering ☆15 Apr 22, 2025 Updated 10 months ago
- The official GitHub page for "What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins… ☆19 Nov 10, 2023 Updated 2 years ago
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆128 Feb 20, 2024 Updated 2 years ago
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding" ☆28 Jul 31, 2024 Updated last year
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆64 Oct 19, 2024 Updated last year
- LINEBot ☆13 Apr 7, 2025 Updated 10 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆27 Aug 20, 2025 Updated 6 months ago
- (ICLR 2025) The Official Code Repository for GUI-World. ☆68 Dec 18, 2024 Updated last year
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b… ☆34 Jun 27, 2024 Updated last year
- Flow Chart Image-to-Code Generation ☆36 Aug 13, 2023 Updated 2 years ago
- This repository contains an implementation of the 3D watermarking algorithm proposed by Cayre et al., based on spectral decomposition. ☆11 Jun 3, 2018 Updated 7 years ago
- data analyst code automaton ☆24 Mar 23, 2025 Updated 11 months ago
- indoor navigation attempts with RSSI fingerprinting and trilateration ☆12 Nov 28, 2018 Updated 7 years ago
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models ☆37 Sep 19, 2023 Updated 2 years ago
- A general probabilistic graphical models framework for Rust ☆10 May 16, 2018 Updated 7 years ago
- A Fruit Slicing Android Game made with C# on Unity Game Engine. ☆11 May 21, 2019 Updated 6 years ago
- A Web Interface for Aurora ☆24 Oct 8, 2013 Updated 12 years ago
- Food Recommendation ChatBot ☆10 Dec 23, 2016 Updated 9 years ago
- Reference CAD files for NEON integrations ☆10 Aug 9, 2024 Updated last year
- Implementation of SocketIO for Node-RED ☆11 Feb 12, 2022 Updated 4 years ago
- A Gaze Tracker using an Hourglass Convolutional Neural Network ☆10 Dec 4, 2018 Updated 7 years ago