showlab/assistgui

showlab / assistgui

☆30

Alternatives and similar repositories for assistgui

Users that are interested in assistgui are comparing it to the libraries listed below.

showlab / GUI-Narrator
Repository of GUI Action Narrator
☆13 · Apr 8, 2025 · Updated last year
njucckevin / SeeClick
The model, data and code for the visual GUI Agent SeeClick
☆493 · Jul 13, 2025 · Updated last year
VITA-Group / TTC-Net
[ICML'26] Beyond Test-Time Memory: State-Space Optimal Control for LLM Reasoning
☆15 · Jun 1, 2026 · Updated last month
showlab / WorldGUI
Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.
☆124 · Jul 27, 2025 · Updated last year
uakarsh / TiLT-Implementation
Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.
☆18 · Apr 23, 2023 · Updated 3 years ago
yifanQi98 / HGRL
☆15 · Nov 3, 2022 · Updated 3 years ago
YuxiangChai / AMEX-codebase
☆33 · Sep 27, 2024 · Updated last year
showlab / videogui
[NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos
☆53 · Feb 22, 2026 · Updated 5 months ago
SakanaAI / fast-weight-product-key-memory
Code for Fast-weight Product Key Memory (FwPKM)
☆19 · Mar 18, 2026 · Updated 4 months ago
HKUST-LongGroup / DyME
[ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆18 · Mar 18, 2026 · Updated 4 months ago
showlab / Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
☆1,201 · Aug 17, 2025 · Updated 11 months ago
RUCBM / GUICourse
GUICourse: From General Vision Language Models to Versatile GUI Agents
☆143 · Mar 1, 2026 · Updated 4 months ago
DripNowhy / Octopus
[ICML 2026] Official implementation for paper: Learning Self-Correction in Vision–Language Models via Rollout Augmentation
☆16 · Jun 4, 2026 · Updated last month
lihaoliu-cambridge / video-shadow-detection
A PyTorch Lightning implementation of “Triple-cooperative Video Shadow Detection” (CVPR'21).
☆13 · Sep 1, 2023 · Updated 2 years ago
StarWalkin / UI-NEXUS
This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…
☆14 · Jul 27, 2025 · Updated last year
SakanaAI / L2D
Large language models to diffusion finetuning code
☆26 · Jun 2, 2025 · Updated last year
boyugou / llava_uground
☆18 · Nov 1, 2024 · Updated last year
GradiusTwinbee / GLIS
Official code for the ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection"
☆14 · Jul 4, 2024 · Updated 2 years ago
ii-research / RAG_Overview
TBA
☆15 · Aug 19, 2025 · Updated 11 months ago
wdchenxyz / CNN2
Code for "CNN^2: Viewpoint Generalization via a Binocular Vision" (NeurIPS 2019)
☆11 · Aug 7, 2021 · Updated 4 years ago
bowen-upenn / Multi-Agent-VQA
[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
☆22 · Sep 21, 2024 · Updated last year
SceneDroid / SceneDroid
☆17 · Oct 30, 2023 · Updated 2 years ago
Heng14 / DyLiN
Source code for CVPR 2023 DyLiN paper
☆22 · Dec 14, 2023 · Updated 2 years ago
Singularity42 / Sync-DRAW
Sync-DRAW: Automatic Video Generation using Deep Recurrent Attentive Architectures
☆12 · Oct 21, 2017 · Updated 8 years ago
michelecafagna26 / cider
Pythonic wrappers for Cider/CiderD evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended) which is more robust to gaming …
☆13 · Dec 4, 2025 · Updated 7 months ago
miemieyanga / ResiDualGAN-DRDG
Implementation of ResiDualGAN and DRDG
☆14 · Apr 15, 2024 · Updated 2 years ago
YushengZhao / TD-STP
[ACM MM 2022] Target-Driven Structured Transformer Planner for Vision-Language Navigation
☆16 · Nov 1, 2022 · Updated 3 years ago
aburns4 / textualforesight
☆12 · Aug 8, 2024 · Updated last year
X-LANCE / weblm
[WSDM 2024] Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding
☆18 · Mar 6, 2024 · Updated 2 years ago
lihaoliu-cambridge / deep-learning-model-saving-helper
A helper that lets you manage your deep learning model's parameters in a convenient way.
☆11 · Nov 25, 2020 · Updated 5 years ago
aburns4 / MoTIF
Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments
☆61 · Aug 19, 2024 · Updated last year
allenai / hyperdecoders
Codebase for Hyperdecoders https://arxiv.org/abs/2203.08304
☆14 · Oct 11, 2022 · Updated 3 years ago
raktimgg / RoboPEPP
☆25 · Jun 12, 2026 · Updated last month
Han-Zongbo / Skip-n
This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.
☆15 · Feb 12, 2024 · Updated 2 years ago
desbma / pyfastcopy
Speed up Python's shutil.copyfile by using the sendfile system call
☆11 · Aug 2, 2018 · Updated 7 years ago
chuyg1005 / seeclick-crawler
☆20 · Apr 24, 2024 · Updated 2 years ago
vios-s / multimodal_segmentation
Code for Disentangle Align and Fuse for Multimodal and Zero-shot Image Segmentation
☆14 · Sep 26, 2020 · Updated 5 years ago
TencentAILabHealthcare / UMIX
☆18 · Oct 29, 2022 · Updated 3 years ago
deerishi / graph-based-semi-supervised-learning
This project explores different techniques (both scalable and non-scalable) for graph-based semi-supervised learning. Recent techniqu…
☆14 · May 28, 2016 · Updated 10 years ago