HengLan/TA-STVG

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/HengLan/TA-STVG)

HengLan / TA-STVG

[ICLR 2025] Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding

☆44

Alternatives and similar repositories for TA-STVG

Users that are interested in TA-STVG are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

HengLan / CGSTVG
View on GitHub
[CVPR 2024] Context-Guided Spatio-Temporal Video Grounding
☆66Jun 28, 2024Updated 2 years ago
jy0205 / STCAT
View on GitHub
[NeurIPS 2022] Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
☆54Mar 5, 2024Updated 2 years ago
Zhuo-Cao / FlashVTG
View on GitHub
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025)
☆39Apr 17, 2025Updated last year
HengLan / Awesome-Visual-Tracking
View on GitHub
Awesome Visual Tracking
☆24Oct 3, 2025Updated 9 months ago
minghangz / SPL
View on GitHub
Generating Structured Pseudo Labels for Noise-resistant Zero-shot Video Sentence Localization
☆16Jul 20, 2023Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
zaiquanyang / LLaVA_Next_STVG
View on GitHub
LLaVA-Next for STVG
☆21Dec 5, 2025Updated 7 months ago
GX77 / LCVSL
View on GitHub
☆14Sep 28, 2023Updated 2 years ago
houzhijian / CONE
View on GitHub
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
☆31Aug 5, 2023Updated 2 years ago
baopj / Vid-Morp
View on GitHub
☆12Dec 6, 2024Updated last year
baopj / E3M
View on GitHub
[ECCV 2024] The first zero-shot setting for spatio-temporal video grounding.
☆11Jul 16, 2024Updated 2 years ago
minghangz / cnm
View on GitHub
Weakly Supervised Video Moment Localisation with Contrastive Negative Sample Mining
☆31Apr 4, 2022Updated 4 years ago
Jayce1kk / SpaceVLLM
View on GitHub
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability
☆17May 8, 2025Updated last year
V-STaR-Bench / V-STaR
View on GitHub
Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
☆45Mar 2, 2026Updated 4 months ago
Soldelli / VLG-Net
View on GitHub
VLG-Net: Video-Language Graph Matching Networks for Video Grounding
☆31May 31, 2022Updated 4 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
nakaizura / Awesome-Cross-Modal-Video-Moment-Retrieval
View on GitHub
前沿论文持续更新--视频时刻定位 or 时域语言定位 or 视频片段检索。
☆14Nov 8, 2023Updated 2 years ago
tzhhhh123 / HC-STVG
View on GitHub
The HC-STVG Dataset
☆65Apr 12, 2023Updated 3 years ago
suhwan-cho / TransFlow
View on GitHub
[ICCVW 2025] TransFlow: Motion Knowledge Transfer from Video Diffusion Models to Video Salient Object Detection
☆15Oct 22, 2025Updated 9 months ago
NeverMoreLCH / Awesome-Video-Grounding
View on GitHub
A reading list of papers about Visual Grounding.
☆31Aug 24, 2022Updated 3 years ago
skhcjh231 / MATR_codebase
View on GitHub
☆22Mar 7, 2025Updated last year
Hydragon516 / FFF-VDI
View on GitHub
[AAAI 2025] Video Diffusion Models are Strong Video Inpainter
☆17Jul 21, 2025Updated last year
gyxxyg / TRACE
View on GitHub
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Casual Event Modeling
☆156Aug 22, 2025Updated 11 months ago
Tanveer81 / ReVisionLLM
View on GitHub
This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
☆47Nov 5, 2025Updated 8 months ago
yongliang-wu / NumPro
View on GitHub
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
☆150Jan 19, 2026Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
xiaomi-research / time-r1
View on GitHub
[NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
☆95Dec 14, 2025Updated 7 months ago
minjoong507 / BM-DETR
View on GitHub
[WACV 2025] Official Pytorch code for "Background-aware Moment Detection for Video Moment Retrieval"
☆16Feb 24, 2025Updated last year
yolish / camera-pose-auto-encoders
View on GitHub
☆21Aug 12, 2025Updated 11 months ago
wangbo-zhao / 2022CVPR-MMMMTBVS
View on GitHub
This is the code for CVPR2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation"
☆19Feb 19, 2023Updated 3 years ago
lntzm / MESM
View on GitHub
The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)
☆32Mar 29, 2024Updated 2 years ago
SkyworkAI / DAQ-VS
View on GitHub
Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024]
☆15Jul 11, 2024Updated 2 years ago
WHB139426 / Grounded-Video-LLM
View on GitHub
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆149Aug 21, 2025Updated 11 months ago
SungHunYang / MonoCLUE
View on GitHub
[ AAAI 2026 ] The official implementation of 'MonoCLUE: Object-Aware Clustering Enhances Monocular 3D Object Detection'
☆21Mar 23, 2026Updated 4 months ago
Jho-Yonsei / STC-Net
View on GitHub
[ICCV 2023] Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition
☆21Jun 24, 2024Updated 2 years ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
doc-doc / NExT-GQA
View on GitHub
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
☆89Jul 1, 2024Updated 2 years ago
EasonXiao-888 / UVCOM
View on GitHub
[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
☆117Jul 17, 2024Updated 2 years ago
mbzuai-oryx / VideoMolmo
View on GitHub
Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"
☆56Jul 5, 2025Updated last year
HengLan / SMOT
View on GitHub
[ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking
☆31Sep 12, 2024Updated last year
lxa9867 / QSD
View on GitHub
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12Feb 27, 2024Updated 2 years ago
zjucsq / PLA
View on GitHub
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12Sep 17, 2023Updated 2 years ago
reckdk / SkelCompletion-3D
View on GitHub
This is an official implementation of the MICCAI2024 paper "Self-supervised 3D Skeleton Completion for Vascular Structures".
☆17Jun 10, 2025Updated last year