showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆42 · Mar 11, 2025 · Updated 11 months ago
Alternatives and similar repositories for MovieSeq
Users who are interested in MovieSeq are comparing it to the repositories listed below.
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) · ☆74 · Jan 20, 2025 · Updated last year
- ☆28 · Apr 8, 2025 · Updated 10 months ago
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models · ☆25 · Jan 1, 2026 · Updated last month
- ☆24 · May 13, 2025 · Updated 9 months ago
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models · ☆37 · Nov 10, 2024 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. · ☆18 · Jul 10, 2025 · Updated 7 months ago
- [NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1" · ☆31 · Nov 15, 2025 · Updated 3 months ago
- A repo for generating random NFTs with metadata 100% on chain! · ☆37 · Mar 8, 2024 · Updated last year
- TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs · ☆103 · Feb 2, 2026 · Updated last week
- LLMBind: A Unified Modality-Task Integration Framework · ☆19 · Jun 16, 2024 · Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant · ☆68 · Jun 9, 2024 · Updated last year
- [ECCV'22 Poster] Explicit Image Caption Editing · ☆22 · Nov 30, 2022 · Updated 3 years ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval · ☆22 · Jun 23, 2025 · Updated 7 months ago
- ☆45 · Aug 14, 2023 · Updated 2 years ago
- Official PyTorch code of GroundVQA (CVPR'24) · ☆64 · Sep 13, 2024 · Updated last year
- [ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniv… · ☆27 · Jun 16, 2025 · Updated 8 months ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision · ☆72 · Jul 10, 2024 · Updated last year
- [CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation · ☆87 · Mar 16, 2025 · Updated 10 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs · ☆26 · Jan 14, 2025 · Updated last year
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2… · ☆24 · Jun 13, 2024 · Updated last year
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" · ☆52 · Dec 5, 2024 · Updated last year
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding · ☆75 · Dec 14, 2025 · Updated 2 months ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" · ☆37 · Oct 9, 2025 · Updated 4 months ago
- TStar is a unified temporal search framework for long-form video question answering · ☆86 · Sep 2, 2025 · Updated 5 months ago
- [CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles · ☆29 · May 13, 2025 · Updated 9 months ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024) · ☆118 · Mar 26, 2025 · Updated 10 months ago
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos · ☆46 · Apr 29, 2024 · Updated last year
- Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation · ☆110 · Apr 16, 2025 · Updated 9 months ago
- Official Implementation of Video-MA2MBA · ☆12 · Dec 3, 2024 · Updated last year
- Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]" · ☆15 · Aug 27, 2025 · Updated 5 months ago
- ☆13 · Jul 10, 2024 · Updated last year
- PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning" · ☆14 · Feb 5, 2024 · Updated 2 years ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision · ☆12 · Sep 17, 2023 · Updated 2 years ago
- ☆11 · Oct 2, 2024 · Updated last year
- This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos" · ☆13 · Aug 22, 2025 · Updated 5 months ago
- ☆46 · Dec 30, 2024 · Updated last year
- ☆155 · Oct 31, 2024 · Updated last year
- T2VScore: Towards A Better Metric for Text-to-Video Generation · ☆80 · Apr 10, 2024 · Updated last year
- [ICCV 2025] Dynamic-VLM · ☆28 · Dec 16, 2024 · Updated last year