[EMNLP 2025 Industry] Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
☆36Oct 22, 2025Updated 4 months ago
Alternatives and similar repositories for TVG-R1
Users who are interested in TVG-R1 are comparing it to the repositories listed below
- ☆18Jun 10, 2025Updated 8 months ago
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…☆64Jan 27, 2026Updated last month
- [AAAI 2024] GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval☆20May 10, 2024Updated last year
- ICCV'23 Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval☆19Aug 22, 2025Updated 6 months ago
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding☆79Dec 14, 2025Updated 2 months ago
- ☆26Jan 4, 2025Updated last year
- Pytorch implementation of the paper 'Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Super…☆19Jan 19, 2024Updated 2 years ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)☆55Sep 7, 2023Updated 2 years ago
- Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…☆23Jan 26, 2025Updated last year
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models☆46Nov 25, 2025Updated 3 months ago
- Official implementation of the paper DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction☆29May 26, 2024Updated last year
- [EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆138Aug 21, 2025Updated 6 months ago
- EventHallusion: Diagnosing Event Hallucinations in Video LLMs☆34Aug 5, 2025Updated 6 months ago
- Latest Papers, Codes and Datasets on VTG-LLMs.☆83Nov 17, 2025Updated 3 months ago
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆37Nov 27, 2024Updated last year
- build vgg16 with pytorch 0.4.0 for classification of CIFAR datasets☆10Mar 31, 2019Updated 6 years ago
- Combined InstantID🔥 and FouriScale to generate high-resolution images!☆11Apr 3, 2024Updated last year
- [AAAI 2024] DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval.☆47Oct 14, 2024Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆23Sep 21, 2025Updated 5 months ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆20Aug 1, 2025Updated 7 months ago
- Code for MME-SID accepted to CIKM 2025 Full Research track.☆27Oct 29, 2025Updated 4 months ago
- ☆13Jun 11, 2024Updated last year
- quagga☆10Apr 7, 2020Updated 5 years ago
- A curated list of awesome resources for salient object ranking (SOR)☆15Sep 28, 2025Updated 5 months ago
- UMB: Understanding Model Behavior for Open-World object Detection (NeurIPS 2024)☆11May 26, 2024Updated last year
- Code of the paper https://arxiv.org/abs/2009.11939. A defocus blur estimation method.☆10Jan 13, 2022Updated 4 years ago
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 3 months ago
- ☆16Oct 9, 2024Updated last year
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…☆10Feb 9, 2025Updated last year
- ☆14Dec 25, 2024Updated last year
- ☆18Aug 7, 2025Updated 6 months ago
- ICME'19: Removing Rain in Videos: A Large-scale Database and A Two-stream ConvLSTM Approach☆12Jul 4, 2022Updated 3 years ago
- [2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval☆42Sep 23, 2021Updated 4 years ago
- ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models☆91Sep 12, 2025Updated 5 months ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr…☆34Jul 3, 2025Updated 8 months ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- Reward Estimation for Variance Reduction in Deep Reinforcement Learning☆10May 8, 2018Updated 7 years ago
- Official code of *Towards Event-oriented Long Video Understanding*☆12Jul 26, 2024Updated last year
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated last month