alvinbhou / Video2Text
📺 An Encoder-Decoder Model for Sequence-to-Sequence learning: Video to Text
☆25Updated 7 years ago
Alternatives and similar repositories for Video2Text
Users who are interested in Video2Text are comparing it to the repositories listed below.
- ☆75Updated 4 years ago
- Efficient violence detection in surveillance videos using Human Skeletons and Motion Estimation☆58Updated 2 years ago
- Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and T…☆644Updated last week
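- A deep learning model for video captioning, implemented with the Transformer architecture on PyTorch. The video captioning task takes a video as input and outputs one sentence describing the content of the whole video (provided the video is short and can be described in a single sentence). The main purpose of this repo is to help visually impaired…☆99Updated 3 years ago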
- ☆50Updated 3 years ago
- A large scale video database for violence detection, which has 2,000 video clips containing violent or non-violent behaviours.☆463Updated last year
- ☆147Updated 3 years ago
- ☆64Updated 5 years ago
- Video Captioning is an encoder-decoder model based on sequence-to-sequence learning☆140Updated last year
- Abnormal Human Behaviors Detection/ Road Accident Detection From Surveillance Videos/ Real-World Anomaly Detection in Surveillance Videos…☆169Updated 3 years ago
- [AAAI 2020] Official implementation of VAANet for Emotion Recognition☆83Updated 2 years ago
- Transformer & CNN Image Captioning model in PyTorch.☆43Updated 2 years ago
- Implemented 3 different architectures to tackle the image captioning problem, i.e., Merged Encoder-Decoder - Bahdanau Attention - Transformer…☆40Updated 4 years ago
- Code on selecting an action based on multimodal inputs. Here in this case inputs are voice and text.☆73Updated 4 years ago
- Video to Text: Natural language description generator for some given video. [Video Captioning]☆359Updated 3 years ago
- Implementation of ViViT: A Video Vision Transformer☆556Updated 4 years ago
- Violence detection in videos using Deep Learning (CNNs + LSTMs). 98.5% video accuracy and 97.81% frame level accuracy (with threshold=3) …☆100Updated 3 years ago
- Make video classification on UCF101 using CNN and RNN based on Pytorch framework.☆64Updated 2 years ago
- Using VideoBERT to tackle video prediction☆134Updated 4 years ago
- Key-frame based summarization of videos☆30Updated 3 years ago
- Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"☆94Updated last year
- This is the official implementation of 2023 ICCV paper "EmoSet: A large-scale visual emotion dataset with rich attributes".☆63Updated last year
- PyTorch implementation of a collection of scalable Video Transformer Benchmarks.☆305Updated 3 years ago
- Emotion Detection in PyTorch☆49Updated 2 years ago
- An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"☆1,023Updated last year
- Online and real-time violence recognition☆15Updated 3 years ago
- New generated dataset for fight detection in surveillance cameras.☆191Updated 4 years ago
- Code for the paper: "Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM"☆60Updated 2 years ago
- Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020☆126Updated last year
- The notebook explains the various steps to obtain the results of publication: "Is Space-Time Attention All You Need for Video Understandi…☆42Updated 4 years ago