google-research-datasets / screen_annotation

The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format and describe the UI elements present on the screen: their type, location, OCR text, and a short description. The dataset was introduced in the paper `ScreenAI: A Vision-Language Model for UI and Infographics Understanding`.
63 · Updated last year

Alternatives and similar repositories for screen_annotation:

Users who are interested in screen_annotation are comparing it to the libraries listed below.