tiangeluo/RegionFocus

A simple visual test-time scaling method for GUI agent grounding

☆26

Alternatives and similar repositories for RegionFocus

Users who are interested in RegionFocus are comparing it to the repositories listed below.

vivo / DiMo-GUI
[EMNLP 2025] Repository for paper "DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning"
☆30 · Jul 2, 2025 · Updated last year
JiuTian-VL / SimpAgent
[ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification
☆48 · Mar 12, 2026 · Updated 4 months ago
Wuzheng02 / OS-Kairos
[ACL 2025] Research code for the paper "OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents"
☆21 · Jun 19, 2025 · Updated last year
YXB-NKU / SE-GUI
[NeurIPS 2025] "Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning"
☆108 · Oct 21, 2025 · Updated 9 months ago
StarWalkin / UI-NEXUS
This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…
☆14 · Jul 27, 2025 · Updated 11 months ago
WenyiWU0111 / CoMEM-Agent
Official repository for paper Auto-scaling Continuous Memory for GUI Agent
☆29 · Feb 2, 2026 · Updated 5 months ago
iLearn-Lab / ACL25-GUI-explorer
[ACL 2025] GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
☆68 · May 28, 2025 · Updated last year
RammusLeo / ScoreHOI
Official repository of ScoreHOI (ICCV 2025)
☆16 · Dec 21, 2025 · Updated 7 months ago
open-compass / MMBench-GUI
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w…
☆112 · Sep 8, 2025 · Updated 10 months ago
Yan98 / GTA1
☆130 · Oct 3, 2025 · Updated 9 months ago
Yuqi-Zhou / GUI-G1
☆28 · Sep 15, 2025 · Updated 10 months ago
runamu / monday
[CVPR 2025] Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
☆33 · Jun 3, 2025 · Updated last year
UCSC-VLAA / VLAA-GUI
Official implementation of VLAA-GUI series
☆34 · Jun 20, 2026 · Updated last month
ZJU-REAL / UI-Zoomer
☆35 · Apr 16, 2026 · Updated 3 months ago
ZJULiHongxin / UIPro
Advanced GUI agents
☆16 · Feb 3, 2026 · Updated 5 months ago
ZJU-REAL / GUI-G2
[AAAI 2026] GUI-G²: Gaussian Reward Modeling for GUI Grounding
☆310 · Apr 15, 2026 · Updated 3 months ago
SchlossLab / Great_Lakes_SLURM
Using the Great Lakes cluster and batch computing with SLURM
☆29 · Feb 15, 2022 · Updated 4 years ago
Princeton-AI2-Lab / ZoomClick
A Practical Zoom-in GUI Grounding and Behavior-Based Evaluation method.
☆25 · Dec 8, 2025 · Updated 7 months ago
showlab / FocusUI
[CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
☆35 · Jun 7, 2026 · Updated last month
meituan-longcat / R-HORIZON
[ICLR'26] R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
☆27 · May 9, 2026 · Updated 2 months ago
DLR-RM / ocdit
☆17 · May 28, 2026 · Updated last month
Jolieresearch / ICPF
☆14 · Nov 26, 2025 · Updated 7 months ago
Lens4MLLMs / LENS
☆29 · Feb 13, 2026 · Updated 5 months ago
YuHengsss / Q-Zoom
☆15 · Apr 15, 2026 · Updated 3 months ago
chenwei746 / EEVG
☆23 · Aug 20, 2024 · Updated last year
microsoft / GUI-Actor
[NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
☆410 · Apr 13, 2026 · Updated 3 months ago
UITron-hub / UItron
☆67 · Sep 6, 2025 · Updated 10 months ago
Tree-Shu-Zhao / RebQ.pytorch
This is the official code for the paper "Reconstruct before Query: Continual Missing Modality Learning with Decomposed Prompt Collaborati…
☆12 · Aug 13, 2024 · Updated last year
roywang021 / EOD
Code for AAAI2024 paper: Towards Evidential and Class Separable Open Set Object Detection
☆12 · Dec 23, 2023 · Updated 2 years ago
chrisyxue / RCN_for_Interpretable_few_shot
The source codes for Region Comparison Network for Interpretable Few-shot Image Classification
☆10 · Sep 17, 2020 · Updated 5 years ago
xxyzll / UMB
UMB: Understanding Model Behavior for Open-World object Detection (NeurIPS 2024)
☆12 · May 26, 2024 · Updated 2 years ago
chxy95 / GenLV
ACMMM24 - Learning A Low-Level Vision Generalist via Visual Task Prompt Arxiv - Exploring Scalable Unified Modeling for General Low-Level…
☆36 · Sep 27, 2025 · Updated 9 months ago
Mr-Bigworth / MMCA
Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral)
☆26 · Jun 11, 2025 · Updated last year
lezhang7 / MOQAGPT
[EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs
☆13 · Dec 28, 2024 · Updated last year
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆172 · Jun 18, 2026 · Updated last month
mpesso1 / BlueRove_Integration
Integration bluerov containing path planner, pid controller, and computer vision system
☆12 · Dec 9, 2022 · Updated 3 years ago
cvlab-columbia / paperbot
PaperBot: Learning to Design Real-World Tools Using Paper
☆13 · Mar 15, 2024 · Updated 2 years ago
KHao123 / LaSe-E2V
The source code for "LaSe-E2V: Towards Language-guided Semantic-Aware Event-to-Video Reconstruction"
☆10 · Jul 5, 2024 · Updated 2 years ago
alibaba / UI-Ins
Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
☆77 · Apr 20, 2026 · Updated 3 months ago