EasonXiao-888 / UVCOM
[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
☆90 Updated 10 months ago
Alternatives and similar repositories for UVCOM
Users that are interested in UVCOM are comparing it to the libraries listed below
- R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024) ☆82 Updated 10 months ago
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr… ☆131 Updated 8 months ago
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆87 Updated 2 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆54 Updated 10 months ago
- A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability ☆94 Updated 5 months ago
- [CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga ☆79 Updated last month
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models ☆106 Updated last month
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online ☆35 Updated last month
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆50 Updated last year
- [ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling ☆94 Updated 3 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection ☆75 Updated last month
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ☆111 Updated 2 months ago
- UniMD: Towards Unifying Moment retrieval and temporal action Detection ☆47 Updated 10 months ago
- ☆97 Updated 9 months ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆65 Updated 11 months ago
- ☆71 Updated 5 months ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆61 Updated 8 months ago
- Official pytorch repository for "TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection" (AAAI 2024 Pape… ☆47 Updated 2 months ago
- VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning ☆121 Updated last week
- Official Implementation of SnAG (CVPR 2024) ☆47 Updated 2 weeks ago
- [ECCV'24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario… ☆52 Updated 8 months ago
- [ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning ☆33 Updated last month
- ☆133 Updated 7 months ago
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆100 Updated 5 months ago
- [2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding ☆31 Updated last year
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos ☆118 Updated 4 months ago
- [CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? ☆56 Updated last month
- FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025) ☆21 Updated last month
- Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023) ☆100 Updated 3 months ago
- Official pytorch repository for "QD-DETR : Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 … ☆228 Updated last year