v-iashin / video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
☆644Feb 1, 2026Updated last week
Alternatives and similar repositories for video_features
Users who are interested in video_features are comparing it to the libraries listed below.
- Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)☆231Apr 8, 2023Updated 2 years ago
- Code for I3D Feature Extraction☆160Aug 7, 2019Updated 6 years ago
- ☆1,037Jun 28, 2020Updated 5 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)☆144Apr 8, 2023Updated 2 years ago
- Official code for "Learning Prompt-Enhanced Context features for Weakly-Supervised Video Anomaly Detection" (IEEE-TIP)☆102Aug 23, 2024Updated last year
- Code release for ActionFormer (ECCV 2022)☆539Apr 11, 2024Updated last year
- Easy to use video deep features extractor☆323Jul 5, 2020Updated 5 years ago
- I3D features extractor with resnet50 backbone☆75Aug 5, 2022Updated 3 years ago
- Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation☆570Jan 30, 2026Updated 2 weeks ago
- [CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos☆101Oct 30, 2022Updated 3 years ago
- Official repository of "TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection"☆11May 25, 2025Updated 8 months ago
- Inflated I3D network with Inception backbone, weights transferred from TensorFlow☆548May 23, 2024Updated last year
- [ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning☆171Dec 4, 2020Updated 5 years ago
- An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"☆364Jul 25, 2024Updated last year
- The official implementation of G-TAD: Sub-Graph Localization for Temporal Action Detection☆221May 25, 2021Updated 4 years ago
- A curated list of temporal action localization/detection and related area (e.g. temporal action proposal) resources.☆587Sep 22, 2022Updated 3 years ago
- Feature Extractor module for videos using the PySlowFast framework☆80Apr 22, 2021Updated 4 years ago
- Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020☆127May 26, 2024Updated last year
- Real-world Anomaly Detection in Surveillance Videos CVPR2018 UCF-Crime dataset☆136Dec 23, 2021Updated 4 years ago
- Official code for AAAI2023 paper "Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection"☆87Jan 6, 2025Updated last year
- [CVPR 2024] Official code for paper: Prompt-Enhanced Multiple Instance Learning for Weakly Supervised Video Anomaly Detection.☆26Aug 19, 2024Updated last year
- TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)☆117Sep 16, 2023Updated 2 years ago
- OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark☆4,918Aug 14, 2024Updated last year
- I3D feature extractor☆43Dec 26, 2019Updated 6 years ago
- A visualization tool for temporal action localization (detection/segmentation).☆12Mar 30, 2023Updated 2 years ago
- An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"☆1,023Apr 12, 2024Updated last year
- Convolutional neural network model for video classification trained on the Kinetics dataset.☆1,819Sep 12, 2019Updated 6 years ago
- [CVPR2023] Code for the paper, TriDet: Temporal Action Detection with Relative Boundary Modeling☆205Dec 27, 2023Updated 2 years ago
- A curated publication list on weakly-supervised temporal action localization☆156Nov 27, 2023Updated 2 years ago
- [CVPR2022] Unsupervised Pre-training for Temporal Action Localization Tasks (UP-TAL)☆29Mar 9, 2022Updated 3 years ago
- This repository focuses on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP☆411Nov 14, 2022Updated 3 years ago
- ☆35May 24, 2019Updated 6 years ago
- S3D Text-Video model trained on HowTo100M using MIL-NCE☆200Jul 3, 2020Updated 5 years ago
- COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning☆291Sep 6, 2022Updated 3 years ago
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021☆18Oct 24, 2021Updated 4 years ago
- Implementation of "CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection" (ICIP 2023)☆42Jul 13, 2024Updated last year
- OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.☆317Apr 29, 2025Updated 9 months ago
- This is an official implementation for "Video Swin Transformers".☆1,630Mar 8, 2023Updated 2 years ago
- I3D Models in PyTorch☆19Oct 23, 2020Updated 5 years ago