jwehrmann / lmtd
Labeled Movie Trailer Dataset
☆15Updated 6 years ago
Related projects:
- Code implementation for our ICPR 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"☆21Updated 3 years ago
- ☆20Updated 9 months ago
- A repository for extracting CNN features from videos using PyTorch☆68Updated last year
- EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important tempo…☆17Updated 6 months ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆49Updated 2 years ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)☆10Updated last year
- ☆36Updated 2 years ago
- A dataset of debunked and verified user-generated videos.☆26Updated 5 years ago
- ☆29Updated 3 years ago
- Listen to Look: Action Recognition by Previewing Audio (CVPR 2020)☆126Updated 3 years ago
- ☆40Updated this week
- COMIC: This is the code repo of our TMM2019 work titled "COMIC: Towards a Compact Image Captioning Model with Attention".☆15Updated 3 years ago
- Research on improving text sentiment analysis with facial features extracted from video, using machine learning.☆32Updated 6 years ago
- A Video Summarization framework for implementation and benchmark of Deep Learning models☆34Updated last week
- Identity-Aware Multi-Sentence Video Description☆13Updated last year
- ☆30Updated 3 years ago
- A unified framework to jointly model images, text, and human attention traces.☆78Updated 3 years ago
- AViD Dataset: Anonymized Videos from Diverse Countries☆55Updated last year
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images☆20Updated last year
- ☆15Updated 3 years ago
- ☆20Updated 4 years ago
- M-VAD Names Dataset. Multimedia Tools and Applications (2019)☆21Updated 5 years ago
- Code of Dense Relational Captioning☆67Updated last year
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆52Updated 3 years ago
- ☆74Updated this week
- A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"☆78Updated 2 years ago
- A length-controllable and non-autoregressive image captioning model.☆66Updated 3 years ago
- ☆38Updated this week
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)☆27Updated 2 years ago
- Localized Narratives☆79Updated 3 years ago