kiyoon / verb_ambiguity
Official implementation of "An Action Is Worth Multiple Words: Handling Ambiguity in Action Recognition", BMVC 2022
☆12 · Updated 2 years ago
Alternatives and similar repositories for verb_ambiguity
Users interested in verb_ambiguity are comparing it to the libraries listed below.
- Shapley values for assessing the importance of each frame in a video ☆17 · Updated 4 years ago
- This is an implementation of the Unsupervised Learning of Video Representations via Dense Trajectory Clustering algorithm. ☆15 · Updated 4 years ago
- [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https… ☆36 · Updated 2 years ago
- Implementations of Transformers for Video ☆23 · Updated 4 years ago
- This dataset contains about 110k images annotated with the depth and occlusion relationships between arbitrary objects. It enables resear… ☆16 · Updated 4 years ago
- ☆22 · Updated last year
- AViD Dataset: Anonymized Videos from Diverse Countries ☆56 · Updated 2 years ago
- ☆72 · Updated last year
- ☆26 · Updated last year
- RareAct: A video dataset of unusual interactions ☆32 · Updated 4 years ago
- ☆73 · Updated 3 years ago
- Visualizing the learned space-time attention using Attention Rollout ☆37 · Updated 3 years ago
- CLIP-It! Language-Guided Video Summarization ☆74 · Updated 4 years ago
- ☆16 · Updated 3 years ago
- Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020. ☆48 · Updated 4 years ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆48 · Updated last year
- Official code for the paper "Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Learning". ☆13 · Updated 3 years ago
- ☆43 · Updated 4 years ago
- Video Noise Contrastive Estimation ☆66 · Updated last year
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 · Updated 2 years ago
- Datasets, transforms and samplers for video in PyTorch ☆88 · Updated last year
- Code for reproducing experiments in "How Useful is Self-Supervised Pretraining for Visual Tasks?" ☆60 · Updated 11 months ago
- Learning Representational Invariances for Data-Efficient Action Recognition ☆33 · Updated 3 years ago
- ☆48 · Updated 3 years ago
- Rethinking Nearest Neighbors for Visual Classification ☆31 · Updated 3 years ago
- We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances… ☆47 · Updated 3 years ago
- [arXiv 2020] Video Representation Learning with Visual Tempo Consistency ☆24 · Updated 5 years ago
- Sapsucker Woods 60 Audiovisual Dataset ☆15 · Updated 2 years ago
- Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch ☆58 · Updated 4 years ago
- Code for Learning to Learn Language from Narrated Video ☆33 · Updated last year