The first attempt to replicate o3-like visual clue-tracking reasoning capabilities.
☆64Jul 8, 2025Updated 8 months ago
Alternatives and similar repositories for SeekWorld
Users who are interested in SeekWorld are comparing it to the libraries listed below
- [ICML 2024] GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model☆70Feb 1, 2026Updated last month
- ☆16Mar 17, 2025Updated 11 months ago
- Official implementation of the RSE paper mKGR.☆20Jan 15, 2026Updated last month
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- Official implementation of the ICCV 2025 paper HoliTracer.☆42Jan 13, 2026Updated last month
- Official Github of "Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework"☆16Jan 4, 2026Updated 2 months ago
- [ISPRS P&RS'25] Official repository of the paper Cross-View Geo-Localization with Panoramic Street-View and VHR Satellite Imagery in Dece…☆20Nov 10, 2025Updated 3 months ago
- Research works from Tencent AI Lab regarding self-evolving agents☆83Jan 30, 2026Updated last month
- ☆23Apr 19, 2024Updated last year
- A collection of papers related to Geo-spatial Information Science in NeurIPS 2024.☆55Jan 5, 2025Updated last year
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models☆46Nov 25, 2025Updated 3 months ago
- ☆132Mar 22, 2025Updated 11 months ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories☆92Aug 8, 2025Updated 7 months ago
- ☆31Feb 8, 2023Updated 3 years ago
- ☆35Nov 6, 2025Updated 4 months ago
- GroundCUA☆68Dec 24, 2025Updated 2 months ago
- 📚 A collection of resources and papers on Large Language Models in autonomous driving☆27Oct 30, 2023Updated 2 years ago
- Official implementation and datasets of AddressCLIP☆66Jul 4, 2024Updated last year
- Chinese to emoji☆31May 12, 2022Updated 3 years ago
- An open-source session replay tool for single-page applications that uses AI analysis, aggregated trends, and a RAG chatbot to help devel…☆11Jan 23, 2026Updated last month
- "MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation"☆135Feb 2, 2026Updated last month
- Codes for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆45Jul 22, 2025Updated 7 months ago
- Visual Spatial Tuning☆181Feb 19, 2026Updated 2 weeks ago
- [IJCV] PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection☆40Sep 25, 2025Updated 5 months ago
- ☆36Jul 1, 2024Updated last year
- Simplifies data migration between Apache Ignite clusters by relying on Apache Avro as an intermediate storage format☆13Jun 27, 2023Updated 2 years ago
- A collection of papers related to Geo-spatial Information Science in CVPR 2025.☆39Apr 1, 2025Updated 11 months ago
- ☆10Sep 7, 2019Updated 6 years ago
- ☆10May 19, 2025Updated 9 months ago
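- A data development platform contributed by APEX and built on big-data platform capabilities. It helps enterprises connect their data at minimal cost, build and consolidate data warehouse models, lower the barrier to data applications, and accumulate data value.☆12Oct 31, 2024Updated last year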
- ☆34Jan 18, 2023Updated 3 years ago
- [ICLR 2026] The official implementation of the paper “Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents”☆99Feb 1, 2026Updated last month
- [ICCV'25] When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning☆47Feb 16, 2026Updated 3 weeks ago
- Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition☆42Dec 5, 2024Updated last year
- Learning Text-Enhanced Urban Region Profiling with Contrastive Language-Image Pre-Training☆42Apr 28, 2024Updated last year
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- Azure Machine Learning - MLOps Python SDKv2☆10Jul 24, 2023Updated 2 years ago
- build vgg16 with pytorch 0.4.0 for classification of CIFAR datasets☆10Mar 31, 2019Updated 6 years ago
- Collaborative Discourse Manager☆11Nov 6, 2016Updated 9 years ago