[AAAI 2026] SIFThinker: Spatially-Aware Image Focus for Visual Reasoning
☆23 · Dec 2, 2025 · Updated 3 months ago
Alternatives and similar repositories for SIFThinker
Users who are interested in SIFThinker are comparing it to the repositories listed below:
- [SIGGRAPH2025] Generative Video Matting ☆58 · Aug 12, 2025 · Updated 6 months ago
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca… ☆52 · Jul 24, 2025 · Updated 7 months ago
- FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025) ☆35 · Apr 17, 2025 · Updated 10 months ago
- The official implementation of our work Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanc… ☆12 · Oct 14, 2024 · Updated last year
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking ☆13 · Apr 12, 2023 · Updated 2 years ago
- ☆66 · Feb 23, 2026 · Updated last week
- The repository of VG-Refiner paper ☆17 · Dec 9, 2025 · Updated 2 months ago
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation ☆12 · Feb 16, 2025 · Updated last year
- SurgLaVi: Official repository ☆27 · Updated this week
- (ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆46 · Jul 1, 2025 · Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆12 · Nov 14, 2025 · Updated 3 months ago
- A simple exam generator and grader written in Python with OpenCV ☆14 · Jan 14, 2026 · Updated last month
- Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation ☆12 · Dec 5, 2025 · Updated 3 months ago
- ☆10 · Apr 7, 2025 · Updated 11 months ago
- ☆22 · Dec 11, 2025 · Updated 2 months ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- ☆11 · Jan 18, 2025 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. ☆19 · Jul 10, 2025 · Updated 7 months ago
- Building a multi-agent RAG system with advanced RAG methods ☆12 · Jan 12, 2025 · Updated last year
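- An introductory project for decoupled front-end/back-end development with SpringBoot and VUE: the front end of a vehicle management system ☆12 · Dec 12, 2022 · Updated 3 years ago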
- ☆23 · Jun 19, 2025 · Updated 8 months ago
- Code of Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation, WACV 2025 ☆10 · Dec 5, 2024 · Updated last year
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P… ☆64 · Jan 27, 2026 · Updated last month
- ☆18 · Jul 31, 2025 · Updated 7 months ago
- Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection". ☆11 · Nov 13, 2024 · Updated last year
- ☆13 · May 17, 2025 · Updated 9 months ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning" ☆22 · Nov 1, 2025 · Updated 4 months ago
- Code for AAAI2024 paper: Towards Evidential and Class Separable Open Set Object Detection ☆12 · Dec 23, 2023 · Updated 2 years ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025) ☆12 · Mar 6, 2025 · Updated last year
- The official repository of UVOSAM ☆13 · Jun 5, 2024 · Updated last year
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection ☆25 · May 31, 2025 · Updated 9 months ago
- ☆28 · Jan 5, 2026 · Updated 2 months ago
- ☆24 · Feb 17, 2026 · Updated 2 weeks ago
- Official training code for MUG-V 10B video generation model. Built on Megatron-LM (v0.14.0) with production-ready distributed training fo… ☆19 · Oct 20, 2025 · Updated 4 months ago
- Image Text Segmentation using FAST corner detection and DBSCAN clustering with k-d tree data structure ☆14 · Feb 27, 2019 · Updated 7 years ago
- mouse pet-ct image segmentation ☆12 · Feb 19, 2023 · Updated 3 years ago
- ☆25 · Feb 12, 2026 · Updated 3 weeks ago
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning ☆19 · Nov 28, 2025 · Updated 3 months ago
- Remote sensing labwork ☆12 · Feb 27, 2018 · Updated 8 years ago