[CVPR 2026 Highlight] VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
☆116Apr 17, 2026Updated 2 weeks ago
Alternatives and similar repositories for VideoITG
Users that are interested in VideoITG are comparing it to the libraries listed below.
- Code release for "MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos"(CVPR2023)☆14Dec 14, 2023Updated 2 years ago
- SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing☆19Dec 28, 2024Updated last year
- Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis (CVPR 2023)☆18Dec 13, 2024Updated last year
- ECCV2024, LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models☆18Aug 9, 2024Updated last year
- Official code repository for "Self-transcendence: Is External Feature Guidance Indispensable for Accelerating Diffusion Transformer Train…☆32Mar 17, 2026Updated last month
- [ICLR 2026] FOCUS: Efficient Keyframe Selection for Long Video Understanding☆65Apr 23, 2026Updated last week
- FPR: False Positive Rectification for Weakly Supervised Semantic Segmentation (ICCV 2023)☆24Sep 24, 2023Updated 2 years ago
- SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality☆34Nov 25, 2024Updated last year
- 📚 A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for str…☆146Apr 13, 2026Updated 2 weeks ago
- [ECCV'24] A novel weakly supervised framework for 3D object detection from 2D bounding boxes. It can easily extend to novel scenarios and…☆36Jul 26, 2024Updated last year
- Official code for our Paper "SSL: A Self-similarity Loss for Improving Generative Image Super-resolution" in ACMMM 2024☆50Jun 1, 2025Updated 11 months ago
- Official PyTorch codes for "Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation", ECCV2024☆31Jul 19, 2024Updated last year
- Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024☆32Jul 18, 2024Updated last year
- Official pytorch implementation of DynaMask: Dynamic Mask Selection for Instance Segmentation (CVPR 2023)☆11Feb 28, 2024Updated 2 years ago
- Official code for our CVPR 2025 paper: "Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption"☆67Sep 15, 2025Updated 7 months ago
- Official PyTorch implementation of paper “InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction”☆33Apr 3, 2026Updated 3 weeks ago
- The public source code of "FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling"☆32Jul 7, 2025Updated 9 months ago
- ☆25Mar 30, 2025Updated last year
- Toward Generalizing Visual Brain Decoding to Unseen Subjects☆28May 14, 2025Updated 11 months ago
- Flexible Image Reflection Removal with Sparse Human Guidance☆12Jul 7, 2025Updated 9 months ago
- ☆12Jul 18, 2024Updated last year
- Project page of "GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors"☆23Jul 1, 2024Updated last year
- ☆16Nov 12, 2024Updated last year
- Code for "SePPO: Semi-Policy Preference Optimization for Diffusion Alignment."☆18Oct 7, 2024Updated last year
- Official pytorch implementation of SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation (CVPR 2023)☆38Jul 31, 2023Updated 2 years ago
- [ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"☆74Oct 25, 2025Updated 6 months ago
- [ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection☆798Jun 26, 2024Updated last year
- [CVPR2025] Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data☆68Apr 24, 2025Updated last year
- Camera streamer on jetson Xavier☆15Feb 18, 2021Updated 5 years ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆33Dec 22, 2025Updated 4 months ago
- Code release for "BoxVIS: Video Instance Segmentation with Box Annotation"☆12Dec 22, 2023Updated 2 years ago
- The official codes of our CVPR-2023 paper: Sharpness-Aware Gradient Matching for Domain Generalization☆80May 31, 2023Updated 2 years ago
- The official code for paper "GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation"☆53Apr 3, 2026Updated 3 weeks ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆237Aug 18, 2025Updated 8 months ago
- ☆29Apr 8, 2025Updated last year
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching".☆28Sep 1, 2022Updated 3 years ago
- Champion solution for traffic sign detection in the Megvii AI Smart Transportation open-source track☆38Sep 8, 2022Updated 3 years ago
- Code for ECCV 2024 Paper "Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution"☆161Dec 2, 2025Updated 5 months ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆23Aug 1, 2025Updated 9 months ago