kvyb / Segmentation-of-web-UI-elements-with-Detectron2
Detectron2 webserver (Faster R-CNN) implementation for Ubuntu 20.04. Real-time object detection served over the internet.
☆28 · Updated last year
Related projects:
- Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? ☆119 · Updated 7 months ago
- A pre-labelled dataset for UI element / layout detection ☆48 · Updated last year
- A tidied repo for UI2CODE, a reverse engineering system that converts UI designs to code automatically and precisely. ☆41 · Updated 3 years ago
- Recognize graphic user interface layout through grouping GUI elements according to their visual attributes ☆31 · Updated 2 years ago
- An accurate GUI element detection approach based on old-fashioned CV algorithms [Upgraded on 5/July/2021] ☆365 · Updated 10 months ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆41 · Updated 3 years ago
- ☆90 · Updated 9 months ago
- Object Detection Model for Scanned Documents ☆77 · Updated 11 months ago
- VINS: Visual Search for Mobile User Interface Design ☆26 · Updated 3 years ago
- The dataset includes screen summaries that describe Android app screenshots' functionalities. It is used for training and evaluation of … ☆44 · Updated 3 years ago
- Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP ☆47 · Updated 2 years ago
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describe the semantic type of a UI object on Android ap… ☆44 · Updated 2 years ago
- A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot! ☆88 · Updated last year
- ☆36 · Updated 5 years ago
- The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluation of the widget ca… ☆16 · Updated 3 years ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆46 · Updated 6 months ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model" ☆68 · Updated last week
- App that generates a JSON file containing the details of various UI components from the screenshot of a wireframe. [Submission for Smart … ☆17 · Updated 3 years ago
- Figma Files Scraper for Research & Studies ☆21 · Updated last year
- A JS web client for connecting to Pipecat bots with voice and vision ☆30 · Updated 2 months ago
- A curated mobile app design database ☆53 · Updated 2 years ago
- ☆18 · Updated 6 months ago
- Sample implementation of natural language image search with OpenAI's CLIP and Elasticsearch or Opensearch. ☆63 · Updated 2 years ago
- Passively collect images for computer vision datasets on the edge. ☆27 · Updated 11 months ago
- Flask-based web application designed to compare text and image embeddings using the CLIP model. ☆21 · Updated 7 months ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents ☆122 · Updated 2 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆109 · Updated 8 months ago
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou… ☆29 · Updated 4 years ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆38 · Updated 5 months ago
- Automating Android apps with ChatGPT-like LLM. ☆91 · Updated 8 months ago