We release the UICaption dataset, which consists of UI images (icons and screenshots) and their associated text descriptions. The dataset was used to pre-train the Lexi model, which provides a generic representation of UI screens and their components.
☆42 · Nov 29, 2022 · Updated 3 years ago
Alternatives and similar repositories for UICaption
Users who are interested in UICaption are comparing it to the repositories listed below.
- MacTok is a research prototype for a one-time anonymous token scheme based on algebraic MACs. ☆23 · Jan 20, 2023 · Updated 3 years ago
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆30 · Dec 19, 2022 · Updated 3 years ago
- VINS: Visual Search for Mobile User Interface Design ☆51 · Jan 9, 2021 · Updated 5 years ago
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments ☆61 · Aug 19, 2024 · Updated last year
- A self-supervised learning approach based on extremely large masking ☆31 · Dec 19, 2022 · Updated 3 years ago
- Code that accompanies the PyData New York (2022) talk: Addressing the sensitivity of Large language models ☆13 · Nov 7, 2022 · Updated 3 years ago
- This repo is the official implementation of "Mask-based Latent Reconstruction for Reinforcement Learning" (NeurIPS 2022). ☆29 · Jul 6, 2023 · Updated 2 years ago
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222 ☆53 · Jun 12, 2023 · Updated 2 years ago
- Gallery for Industry AI demos ☆18 · May 1, 2023 · Updated 2 years ago
- MySQL Tools Service that provides MySQL Server data management capabilities. ☆22 · Jun 11, 2024 · Updated last year
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training ☆22 · Mar 19, 2022 · Updated 4 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆18 · Sep 17, 2021 · Updated 4 years ago
- Code for the paper "LASER: LLM Agent with State-Space Exploration for Web Navigation" ☆35 · Sep 26, 2023 · Updated 2 years ago
- Azure Object Detection Accelerator. A repo for quickly and easily setting up a sample object detection project with training, labelling, … ☆20 · May 23, 2023 · Updated 2 years ago
- PANENE: Progressive Approximate NEarest NEighbors ☆20 · Feb 12, 2025 · Updated last year
- 📖 UI/UX context detection engine ☆12 · Jan 3, 2021 · Updated 5 years ago
- The dataset includes widget captions that describe UI elements' functionalities. It is used for training and evaluation of the widget ca… ☆23 · Jun 24, 2021 · Updated 4 years ago
- KittenTTS is an ultra-lightweight, CPU-friendly text-to-speech model with 15M params for real-time, high-quality voices. Open source, fas… ☆24 · Updated this week
- A repository for managing workshop contents for learning Microsoft Azure's data analytics platform with a focus on Databricks SQL and Syn… ☆21 · Jul 4, 2023 · Updated 2 years ago
- Code and utilities for creating a Vision-and-Language Navigation (VLN) simulator environment from a physical space. ☆12 · Nov 10, 2020 · Updated 5 years ago
- A simple voice conversion tool ☆20 · Mar 10, 2022 · Updated 4 years ago
- Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments ☆12 · Nov 29, 2021 · Updated 4 years ago
- An implementation of various color transfer algorithms. ☆26 · Apr 22, 2025 · Updated 10 months ago
- Large-Scale Bidirectional Training for Zero-Shot Image Captioning ☆21 · Feb 14, 2023 · Updated 3 years ago
- Research unikernel for virtualized services ☆53 · Dec 6, 2022 · Updated 3 years ago
- A collection of papers I am interested in. ☆29 · Apr 3, 2023 · Updated 2 years ago
- Python repo for the XDK auto-generated code. ☆22 · Feb 28, 2026 · Updated 2 weeks ago
- Matterport and Unreal Engine Extension for Omniverse Isaac Sim ☆20 · May 9, 2024 · Updated last year
- In-IDE Code Search ☆29 · Apr 29, 2022 · Updated 3 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- NetPassage allows you to expose a web service, such as Microsoft Bot running on your local machine or on the private network to the publi… ☆15 · Jul 20, 2023 · Updated 2 years ago
- ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation. AAAI, 2025 ☆13 · Aug 25, 2025 · Updated 6 months ago
- Audio to Audio (Whisper+ChatGPT+Bark) ☆11 · Apr 30, 2023 · Updated 2 years ago
- This is the official repository for MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation Learning towards Efficient Vision-and-La… ☆14 · Jun 6, 2024 · Updated last year