zhangquanchen / SIFThinker
[AAAI 2026] SIFThinker: Spatially-Aware Image Focus for Visual Reasoning
☆22Dec 2, 2025Updated 2 months ago
Alternatives and similar repositories for SIFThinker
Users who are interested in SIFThinker are comparing it to the repositories listed below
- A collection of research papers on hypervisor testing.☆27Jan 31, 2026Updated 2 weeks ago
- [SIGGRAPH2025] Generative Video Matting☆57Aug 12, 2025Updated 6 months ago
- your finance bro Agent for trading and investing☆108Nov 8, 2025Updated 3 months ago
- FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025)☆34Apr 17, 2025Updated 9 months ago
- The official implementation of our work Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanc…☆12Oct 14, 2024Updated last year
- 🐱 PawHaven — an open-source platform that helps volunteers, shelters, and adopters report, track, and share stray animal rescue cases (f…☆88Updated this week
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation☆12Feb 16, 2025Updated 11 months ago
- DisTime: Distribution-based Time Representation for Video Large Language Models.☆18Jul 10, 2025Updated 7 months ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- ☆65Jan 7, 2026Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs☆12Nov 14, 2025Updated 3 months ago
- (ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos"☆45Jul 1, 2025Updated 7 months ago
- SurgLaVi: Large-Scale Hierarchical Datasets for Surgical Vision–Language Representation Learning☆23Feb 2, 2026Updated last week
- Code of Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation, WACV 2025☆10Dec 5, 2024Updated last year
- ☆22Dec 11, 2025Updated 2 months ago
- Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation☆12Dec 5, 2025Updated 2 months ago
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- ☆10Apr 7, 2025Updated 10 months ago
- ☆11Jan 18, 2025Updated last year
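- An introductory project for decoupled front-end/back-end development with SpringBoot and Vue: the front end of a vehicle management system☆12Dec 12, 2022Updated 3 years ago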
- Quick Long Video Understanding [TMLR2025]☆75Oct 27, 2025Updated 3 months ago
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…☆64Jan 27, 2026Updated 2 weeks ago
- ☆13Jan 21, 2025Updated last year
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection".☆11Nov 13, 2024Updated last year
- Pytorch Implementation of Our NAACL 2021 Paper "Incorporating Syntax and Semantics in Coreference Resolution with Heterogeneous Graph Att…☆10Apr 28, 2022Updated 3 years ago
- Official training code for MUG-V 10B video generation model. Built on Megatron-LM (v0.14.0) with production-ready distributed training fo…☆19Oct 20, 2025Updated 3 months ago
- ☆11Jul 3, 2023Updated 2 years ago
- MMM 2021: Crossed-Time Delay Neural Network for Speaker Recognition☆11Dec 4, 2021Updated 4 years ago
- [CVPR 2024] Official repository of ST_GT☆10Sep 15, 2024Updated last year
- Remote sensing labwork☆12Feb 27, 2018Updated 7 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022☆11Aug 20, 2022Updated 3 years ago
- ☆13May 17, 2025Updated 8 months ago
- mouse pet-ct image segmentation☆12Feb 19, 2023Updated 2 years ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆22May 31, 2025Updated 8 months ago
- ☆24Nov 27, 2025Updated 2 months ago
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning☆19Nov 28, 2025Updated 2 months ago