keplerlab / katna
Tool for automating common video key-frame extraction, video compression and Image Auto-crop/Image-resize tasks
☆357Updated 10 months ago
Alternatives and similar repositories for katna
Users that are interested in katna are comparing it to the libraries listed below
- A simple Python tool to extract key-frames from a video file using peak estimation from frame difference.☆161Updated 2 weeks ago
- TransNet V2: Shot Boundary Detection Neural Network☆638Updated last year
- Codebase for CVPR2020 A Local-to-Global Approach to Multi-modal Movie Scene Segmentation☆229Updated last year
- Tools for movie and video research☆289Updated 2 years ago
- ☆134Updated last year
- AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection - CVPR NAS 2023☆162Updated 2 years ago
- ☆244Updated 2 years ago
- [NeurIPS 2021] Moment-DETR code and QVHighlights dataset☆307Updated last year
- Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]☆360Updated 3 years ago
- Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)☆227Updated 2 years ago
- Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]☆178Updated 2 years ago
- COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning☆289Updated 2 years ago
- Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and T…☆596Updated 4 months ago
- Video to Text: Natural language description generator for some given video. [Video Captioning]☆345Updated 3 years ago
- Experimenting with different Summarizing techniques on SumMe Dataset☆138Updated 4 years ago
- [ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding☆354Updated last year
- [NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset☆278Updated last year
- Code for the HowTo100M paper☆270Updated 5 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)☆242Updated 2 years ago
- Obtains key-frames from a video using color histograms, SVD, and dynamic clustering. This analysis can be used to identify frames w…☆22Updated 4 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆187Updated last month
- This repository contains a script to divide a video into key frames.☆169Updated 7 years ago
- DSNet: A Flexible Detect-to-Summarize Network for Video Summarization☆216Updated 3 years ago
- GIT: A Generative Image-to-text Transformer for Vision and Language☆567Updated last year
- An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"☆356Updated 10 months ago
- Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022]☆125Updated last year
- UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or …☆217Updated last year
- [ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects☆62Updated 3 months ago
- Large-scale text-video dataset. 10 million captioned short videos.☆639Updated 9 months ago
- Search photos on Unsplash based on OpenAI's CLIP model, support search with joint image+text queries and attention visualization.☆222Updated 3 years ago