The dataset includes screen summaries that describe the functionalities of Android app screenshots. It is used for training and evaluating the screen2words models (our paper accepted by UIST'21 will be linked soon).
☆67 · Jul 27, 2021 · Updated 4 years ago
Alternatives and similar repositories for screen2words
Users that are interested in screen2words are comparing it to the libraries listed below.
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆48 · Aug 2, 2021 · Updated 4 years ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆92 · Mar 7, 2024 · Updated 2 years ago
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describe the semantic type of a UI object on Android ap… ☆54 · Jan 14, 2022 · Updated 4 years ago
- ☆33 · Sep 27, 2024 · Updated last year
- [AAAI2025 Oral] BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking ☆15 · Apr 22, 2025 · Updated last year
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments ☆61 · Aug 19, 2024 · Updated last year
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou… ☆34 · Aug 20, 2020 · Updated 5 years ago
- ☆17 · May 14, 2024 · Updated 2 years ago
- VINS: Visual Search for Mobile User Interface Design ☆53 · Jan 9, 2021 · Updated 5 years ago
- ☆17 · Oct 30, 2023 · Updated 2 years ago
- Seq2act: Mapping Natural Language Instructions to Mobile UI Action Sequences from Google research ☆15 · Jul 13, 2020 · Updated 5 years ago
- An accurate GUI element detection approach based on old-fashioned CV algorithms [Upgraded on 5/July/2021] ☆544 · Nov 8, 2023 · Updated 2 years ago
- ☆36 · May 29, 2025 · Updated last year
- ScreenQA dataset was introduced in the "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots" paper. It contains ~86K … ☆150 · Feb 7, 2025 · Updated last year
- This repository holds the data and code for the AndroR2 dataset of manually-reproduced bug reports for Android apps ☆26 · Jun 11, 2021 · Updated 5 years ago
- ☆36 · Jan 22, 2019 · Updated 7 years ago
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆128 · Feb 20, 2024 · Updated 2 years ago
- a py3 lib for NLP & image-caption metrics: BLEU METEOR CIDEr ROUGE SPICE WMD ☆14 · Sep 13, 2022 · Updated 3 years ago
- Overview of Clone Detection Tools for Java ☆14 · Aug 23, 2025 · Updated 9 months ago
- A GAN-based GUI generation method ☆78 · May 22, 2021 · Updated 5 years ago
- ☆30 · Apr 16, 2024 · Updated 2 years ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆68 · Oct 19, 2024 · Updated last year
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b… ☆36 · Jun 27, 2024 · Updated last year
- (ICLR 2025) The Official Code Repository for GUI-World. ☆69 · Dec 18, 2024 · Updated last year
- Code for paper "Prompt Engineering a Prompt Engineer" (https://arxiv.org/abs/2311.05661) ☆12 · Aug 1, 2024 · Updated last year
- ☆102 · Dec 22, 2023 · Updated 2 years ago
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆383 · May 11, 2026 · Updated last month
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models ☆37 · Sep 19, 2023 · Updated 2 years ago
- Game UI Glitch Detection via Bug Understanding ☆12 · Jul 31, 2021 · Updated 4 years ago
- ☆10 · Aug 28, 2020 · Updated 5 years ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin… ☆14 · Jul 27, 2025 · Updated 10 months ago
- ☆15 · Apr 6, 2023 · Updated 3 years ago
- The model, data and code for the visual GUI Agent SeeClick ☆483 · Jul 13, 2025 · Updated 11 months ago
- https://towardsdatascience.com/instance-segmentation-web-app-63016b8ed4ae ☆12 · Mar 3, 2021 · Updated 5 years ago
- Recognize graphic user interface layout through grouping GUI elements according to their visual attributes ☆50 · Jun 17, 2022 · Updated 3 years ago
- ☆47 · Apr 11, 2024 · Updated 2 years ago
- Urban Generative Intelligence (UGI): A Foundational Platform for Embodied Agent and Future City ☆12 · Dec 17, 2023 · Updated 2 years ago
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools. ☆47 · Dec 17, 2025 · Updated 5 months ago
- multiview-stereo is an attempt to create 3D structures out of multiple views of 2D images. A volumetric surface is constructed entirely… ☆13 · Jun 16, 2020 · Updated 5 years ago