jwehrmann / lmtd
Labeled Movie Trailer Dataset
☆16 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for lmtd
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers. ☆49 · Updated 2 years ago
- ☆21 · Updated 11 months ago
- PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos" ☆40 · Updated 3 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆111 · Updated 4 years ago
- Code implementation for our ICPR 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings" ☆21 · Updated 3 years ago
- A dataset of debunked and verified user-generated videos. ☆26 · Updated 5 years ago
- Audio Visual Instance Discrimination with Cross-Modal Agreement ☆127 · Updated 3 years ago
- Listen to Look: Action Recognition by Previewing Audio (CVPR 2020) ☆127 · Updated 3 years ago
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020) ☆90 · Updated 2 years ago
- Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight) ☆80 · Updated 3 months ago
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images ☆20 · Updated last year
- ☆31 · Updated 3 years ago
- Cross-model active contrastive coding ☆21 · Updated 3 years ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022) ☆11 · Updated last year
- Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection (ECCV 2022) ☆65 · Updated last year
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset ☆52 · Updated 3 years ago
- ☆59 · Updated 2 years ago
- A non-JIT implementation / replication of OpenAI's CLIP in PyTorch ☆34 · Updated 3 years ago
- ☆17 · Updated 3 years ago
- Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018 ☆172 · Updated 3 years ago
- ☆37 · Updated 2 years ago
- ☆15 · Updated 3 years ago
- Use CLIP to represent video for Retrieval Task ☆69 · Updated 3 years ago
- Unofficial Implementation of Google DeepMind's paper `Objects that Sound` ☆83 · Updated 6 years ago
- Implementations of Transformers for Video ☆24 · Updated 3 years ago
- PyTorch implementation of the CVPR 2021 paper "Distilling Audio-Visual Knowledge by Compositional Contrastive Learning" ☆86 · Updated 3 years ago
- PyTorch code for S2IGAN ☆41 · Updated 4 years ago
- Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval". CVPR 2022 ☆95 · Updated 2 years ago
- Official implementation of FOP method as described in "Fusion and Orthogonal Projection for Improved Face-Voice Association" ☆17 · Updated 9 months ago