wentaozhu / AutoShot
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection - CVPR NAS 2023
☆122 · Updated last year
Alternatives and similar repositories for AutoShot:
Users who are interested in AutoShot are comparing it to the repositories listed below.
- Code for CVPR 2022 paper "Scene Consistency Representation Learning for Video Scene Segmentation" ☆90 · Updated last year
- A simple script that reads a directory of videos, grabs a random frame, and automatically discovers a prompt for it ☆133 · Updated 11 months ago
- [ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects ☆59 · Updated 2 years ago
- official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark ☆34 · Updated last year
- ☆171 · Updated 6 months ago
- Official pytorch repository for "QD-DETR : Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 … ☆218 · Updated last year
- ☆120 · Updated last year
- [ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding ☆332 · Updated 8 months ago
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models ☆312 · Updated 7 months ago
- TransNet V2: Shot Boundary Detection Neural Network ☆524 · Updated last year
- Official PyTorch implementation of the “Spatial-Semantic Collaborative Cropping for User Generated Content”. (AAAI24) ☆52 · Updated 9 months ago
- [ICCV 2023, Official Code] for paper "Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspect… ☆318 · Updated 5 months ago
- A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions. ☆106 · Updated 3 months ago
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr… ☆122 · Updated 4 months ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi… ☆46 · Updated last year
- VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling ☆261 · Updated this week
- ☆76 · Updated last month
- mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023) ☆221 · Updated last year
- [CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models ☆227 · Updated last month
- ☆169 · Updated 6 months ago
- Implementation of Cross-category Video Highlight Detection via Set-based Learning (ICCV 2021). ☆72 · Updated 3 years ago
- [CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing. ☆153 · Updated 6 months ago
- Official repository for "PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout" (CVPR 2023). ☆127 · Updated 6 months ago
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models ☆127 · Updated 3 months ago
- EILeV: Eliciting In-Context Learning in Vision-Language Models for Videos Through Curated Data Distributional Properties ☆119 · Updated 2 months ago
- Supercharged BLIP-2 that can handle videos ☆118 · Updated last year
- (CVPR 2024) Official code for paper "Towards Language-Driven Video Inpainting via Multimodal Large Language Models" ☆83 · Updated 9 months ago
- Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral). ☆138 · Updated 2 years ago
- ☆236 · Updated 2 years ago
- ☆142 · Updated 6 months ago