☆30Apr 16, 2024Updated 2 years ago
Alternatives and similar repositories for assistgui
Users who are interested in assistgui are comparing it to the libraries listed below.
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated last year
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.☆122Jul 27, 2025Updated 10 months ago
- The model, data and code for the visual GUI Agent SeeClick☆482Jul 13, 2025Updated 10 months ago
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos☆52Feb 22, 2026Updated 3 months ago
- [CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"☆38Nov 11, 2025Updated 6 months ago
- 🌐 WebCode: Claude Code in your web browser (without server!)☆25Jul 31, 2025Updated 9 months ago
- API Blueprint for Hologram's REST API☆12Dec 8, 2023Updated 2 years ago
- PyTorch implementation of MoLA☆23Jun 9, 2025Updated 11 months ago
- A web page annotation toolbar to help you better interact with AI coding agents.☆49Feb 21, 2026Updated 3 months ago
- ☆15Nov 3, 2022Updated 3 years ago
- ☆32Sep 27, 2024Updated last year
- Official Project Webpage for paper "DiffSRL: Learning Dynamic-aware State Representation for Control via Differentiable Simulation"☆12Apr 4, 2022Updated 4 years ago
- ☆13Jun 14, 2023Updated 2 years ago
- Implementation of the paper: Going Full-TILT Boogie on Document Understanding with Text-Image-Layout Transformer.☆18Apr 23, 2023Updated 3 years ago
- Code and Dataset for our CVPR 2022 paper "Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training"☆12Jul 8, 2022Updated 3 years ago
- [ICLR 2026] - One2Scene☆42Feb 26, 2026Updated 3 months ago
- 💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.☆1,191Aug 17, 2025Updated 9 months ago
- GUICourse: From General Vision Language Models to Versatile GUI Agents☆141Mar 1, 2026Updated 2 months ago
- Under construction☆13Jan 15, 2025Updated last year
- The dataset includes screen summaries that describe the functionality of Android app screenshots. It is used for training and evaluation of …☆67Jul 27, 2021Updated 4 years ago
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents☆232Jun 16, 2025Updated 11 months ago
- [ICLR 2026] SparseD: Sparse Attention for Diffusion Language Models☆65Feb 22, 2026Updated 3 months ago
- Official code for the ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection"☆14Jul 4, 2024Updated last year
- ☆17Oct 30, 2023Updated 2 years ago
- A mobile GUI search engine using a vision-language model☆14May 5, 2025Updated last year
- [CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering☆22Sep 21, 2024Updated last year
- Pythonic wrappers for Cider/CiderD evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended) which is more robust to gaming …☆13Dec 4, 2025Updated 5 months ago
- ☆18Nov 1, 2024Updated last year
- Implementation of ResiDualGAN and DRDG☆14Apr 15, 2024Updated 2 years ago
- Official implementation of "MedITok: A Unified Tokenizer for Medical Image Synthesis and Interpretation"☆28Apr 3, 2026Updated last month
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments☆61Aug 19, 2024Updated last year
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.☆47Dec 17, 2025Updated 5 months ago
- ☆16Jun 9, 2020Updated 5 years ago
- [ACM MM 2022] Target-Driven Structured Transformer Planner for Vision-Language Navigation☆16Nov 1, 2022Updated 3 years ago
- A helper that lets you manage your deep learning model's parameters conveniently.☆11Nov 25, 2020Updated 5 years ago
- Brings the <a href> to Gio. Small library to open URLs, supports Android, Windows, Linux, FreeBSD, iOS, macOS and WASM☆12Sep 3, 2022Updated 3 years ago
- ☆40Apr 9, 2026Updated last month
- Source code for CVPR 2023 DyLiN paper☆22Dec 14, 2023Updated 2 years ago
- This repository contains the code of our paper 'Skip \n: A simple method to reduce hallucination in Large Vision-Language Models'.☆15Feb 12, 2024Updated 2 years ago