[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆43 · Mar 11, 2025 · Updated last year
Alternatives and similar repositories for MovieSeq
Users who are interested in MovieSeq are comparing it to the libraries listed below.
- Repository of GUI Action Narrator ☆13 · Apr 8, 2025 · Updated 11 months ago
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models ☆24 · Jan 1, 2026 · Updated 2 months ago
- 👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) ☆74 · Jan 20, 2025 · Updated last year
- Edit and Generate Anything in 3D world! ☆14 · Apr 15, 2023 · Updated 2 years ago
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆81 · Apr 10, 2024 · Updated last year
- ☆28 · Apr 8, 2025 · Updated 11 months ago
- Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation ☆110 · Apr 16, 2025 · Updated 11 months ago
- ☆12 · Sep 15, 2024 · Updated last year
- ☆27 · May 13, 2025 · Updated 10 months ago
- [CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation ☆97 · Mar 16, 2025 · Updated last year
- LLMBind: A Unified Modality-Task Integration Framework ☆19 · Jun 16, 2024 · Updated last year
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models ☆38 · Nov 10, 2024 · Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant ☆69 · Jun 9, 2024 · Updated last year
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval ☆17 · Aug 24, 2022 · Updated 3 years ago
- ☆17 · Oct 10, 2023 · Updated 2 years ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs ☆117 · Mar 12, 2026 · Updated 2 weeks ago
- DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles ☆32 · Mar 8, 2026 · Updated 3 weeks ago
- [ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniv… ☆27 · Jun 16, 2025 · Updated 9 months ago
- [NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding ☆83 · Dec 14, 2025 · Updated 3 months ago
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework. ☆115 · Jul 27, 2025 · Updated 8 months ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023) ☆32 · May 15, 2023 · Updated 2 years ago
- Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2… ☆24 · Jun 13, 2024 · Updated last year
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆117 · Sep 15, 2022 · Updated 3 years ago
- Official PyTorch code of GroundVQA (CVPR'24) ☆64 · Sep 13, 2024 · Updated last year
- PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning" ☆14 · Feb 5, 2024 · Updated 2 years ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant (NeurIPS 2024) ☆121 · Mar 26, 2025 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. ☆20 · Jul 10, 2025 · Updated 8 months ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model ☆16 · Jan 31, 2024 · Updated 2 years ago
- [ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision ☆72 · Jul 10, 2024 · Updated last year
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆46 · Apr 29, 2024 · Updated last year
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision ☆12 · Sep 17, 2023 · Updated 2 years ago
- [CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024 ☆133 · May 11, 2025 · Updated 10 months ago
- [ECCV'22 Poster] Explicit Image Caption Editing ☆22 · Nov 30, 2022 · Updated 3 years ago
- (NeurIPS 2023) Open-set visual object query search & localization in long-form videos ☆26 · Feb 1, 2024 · Updated 2 years ago
- [ICCV 2023] Label-Efficient Online Continual Object Detection in Streaming Video ☆23 · Jan 8, 2024 · Updated 2 years ago
- Modality-Invariant Temporal Representation Learning ☆22 · Apr 21, 2023 · Updated 2 years ago
- This is the project page of ShowRoom3D ☆26 · Dec 22, 2023 · Updated 2 years ago
- Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos" ☆158 · Jun 23, 2025 · Updated 9 months ago
- ☆45 · Aug 14, 2023 · Updated 2 years ago