UITron-hub / UItron
☆62Sep 6, 2025Updated 5 months ago
Alternatives and similar repositories for UItron
Users that are interested in UItron are comparing it to the libraries listed below
- GUIEvalKit: Open-source Evaluation Toolkit for GUI Agents☆19Jan 26, 2026Updated 3 weeks ago
- ☆12Aug 8, 2024Updated last year
- This code is used to get images from google maps given a GPS region or a center GPS point and a Zoom level.☆18Dec 16, 2024Updated last year
- ☆23Mar 18, 2024Updated last year
- [ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification☆41Jan 21, 2026Updated 3 weeks ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost☆108Jul 17, 2025Updated 6 months ago
- Latest Papers, Codes and Datasets on VTG-LLMs.☆80Nov 17, 2025Updated 2 months ago
- Official Repo for "Why Settle for One? Text-to-ImageSet Generation and Evaluation"☆21Oct 1, 2025Updated 4 months ago
- ☆10Oct 5, 2022Updated 3 years ago
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models☆26Feb 4, 2026Updated last week
- A curated collection of resources, tools, and frameworks for developing GUI Agents.☆303Updated this week
- ☆61Aug 5, 2025Updated 6 months ago
- Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)☆98Oct 14, 2024Updated last year
- ☆17Oct 30, 2023Updated 2 years ago
- [AAAI2025 Oral] BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking☆12Apr 22, 2025Updated 9 months ago
- ☆10Apr 22, 2021Updated 4 years ago
- ☆20Oct 15, 2025Updated 4 months ago
- ☆12Apr 18, 2025Updated 9 months ago
- a Video Quality Analysis Toolkit☆13May 16, 2025Updated 8 months ago
- ☆10Dec 26, 2023Updated 2 years ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆13Jul 27, 2025Updated 6 months ago
- Code for Research Project TLDR☆25Jul 28, 2025Updated 6 months ago
- ☆13Mar 28, 2025Updated 10 months ago
- Official codebase for ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models"☆34Oct 1, 2025Updated 4 months ago
- This setup allows training end-to-end neural models for spoken language understanding (SLU).☆11Jun 12, 2023Updated 2 years ago
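- Tsinghua University campus network client and networking library, for command-line environments, the Windows, Linux, and Mac OS X desktop platforms, and the UWP, iOS, and Android mobile platforms☆12Mar 3, 2020Updated 5 years ago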
- ☆12Mar 13, 2025Updated 11 months ago
- Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners"☆64Dec 4, 2025Updated 2 months ago
- Building a comprehensive and handy list of papers for GUI agents☆633Oct 27, 2025Updated 3 months ago
- Nex General Agentic Data Pipeline, an end-to-end pipeline for generating high-quality agentic training data.☆30Nov 19, 2025Updated 2 months ago
- This is the GitHub repository for Data Augmentation for Saliency Prediction via Latent Diffusion paper in ECCV 2024, Milano, Italy☆14Nov 7, 2024Updated last year
- [NeurIPS 2025]《SD-VLM: Spatial Measuring and Understanding with Depth-encoded Vision Language Models》☆33Dec 29, 2025Updated last month
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"☆28Jul 7, 2025Updated 7 months ago
- ☆13Dec 9, 2024Updated last year
- Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning☆36Dec 22, 2025Updated last month
- ☆26Jan 23, 2026Updated 3 weeks ago
- Official implementation of Deep Factorized Metric Learning.☆20Jun 6, 2023Updated 2 years ago
- Delta-CoMe achieves near-lossless 1-bit compression and has been accepted by NeurIPS 2024☆58Nov 16, 2024Updated last year
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆130Nov 6, 2024Updated last year