CVPR25
☆26 · Jul 2, 2025 · Updated 7 months ago
Alternatives and similar repositories for MP-GUI
Users that are interested in MP-GUI are comparing it to the libraries listed below
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models ☆21 · Jan 29, 2025 · Updated last year
- ☆23 · Jul 8, 2023 · Updated 2 years ago
- Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding" ☆28 · Jul 31, 2024 · Updated last year
- ☆38 · Feb 8, 2024 · Updated 2 years ago
- [Awesome] 🔥🔥🔥 Latest Papers, Codes and Datasets on Streaming / Online Video Understanding ☆112 · Jan 13, 2026 · Updated last month
- [CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" ☆82 · Feb 13, 2026 · Updated 2 weeks ago
- Self-similarity Prior Distillation for Unsupervised Remote Physiological Measurement ☆10 · Oct 18, 2024 · Updated last year
- ☆13 · Dec 18, 2024 · Updated last year
- ☆13 · Jul 3, 2024 · Updated last year
- ☆29 · Feb 13, 2026 · Updated 2 weeks ago
- ☆16 · Jan 23, 2026 · Updated last month
- This is a project on visual spatial reasoning tasks-SIBench ☆25 · Jan 12, 2026 · Updated last month
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models ☆46 · Jan 8, 2025 · Updated last year
- ☆31 · Sep 19, 2025 · Updated 5 months ago
- ☆10 · Nov 9, 2023 · Updated 2 years ago
- [2022.05.16 ~ 2022.06.10] 🌤️ Clear photos without fine dust 📷 - Boostcamp AI Tech 3rd cohort final project ☆14 · Jun 11, 2022 · Updated 3 years ago
- study for python, ML, DL, Data Science, etc ☆15 · Feb 6, 2026 · Updated 3 weeks ago
- Owl Eyes: Spotting UI Display Issues via Visual Understanding ☆11 · Jul 31, 2020 · Updated 5 years ago
- EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large … ☆13 · Apr 1, 2025 · Updated 10 months ago
- cliptrase ☆47 · Sep 1, 2024 · Updated last year
- ☆14 · Sep 11, 2025 · Updated 5 months ago
- ☆10 · Dec 26, 2023 · Updated 2 years ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration ☆20 · Mar 21, 2025 · Updated 11 months ago
- Code for paper: Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation ☆12 · Aug 3, 2024 · Updated last year
- ☆20 · Nov 21, 2025 · Updated 3 months ago
- Exposure-slot: Exposure-centric representations learning with Slot-in-Slot Attention for Region-aware Exposure Correction, Computer Visi… ☆21 · Sep 2, 2025 · Updated 5 months ago
- ☆11 · Oct 17, 2024 · Updated last year
- ☆13 · May 15, 2025 · Updated 9 months ago
- ☆10 · Sep 25, 2024 · Updated last year
- Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding" ☆11 · Apr 11, 2025 · Updated 10 months ago
- ☆61 · Dec 5, 2025 · Updated 2 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆137 · Jul 28, 2025 · Updated 7 months ago
- ☆51 · May 11, 2025 · Updated 9 months ago
- PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning" ☆14 · Feb 5, 2024 · Updated 2 years ago
- https://github.com/jzhang38/TinyLlama using only PyTorch ☆13 · Jan 24, 2024 · Updated 2 years ago
- ☆12 · Jul 16, 2025 · Updated 7 months ago
- [ICLR 2025] No Preference Left Behind: Group Distributional Preference Optimization ☆14 · Apr 21, 2025 · Updated 10 months ago
- ☆12 · Jul 16, 2024 · Updated last year
- ☆31 · Feb 18, 2026 · Updated last week