sandraavila / vsumm
This repository contains the data (datasets, video/user summaries, CUS evaluation, and results) from the paper "VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method." We created the repository in 2011 on Google Sites, which is now inactive.
☆15 · Updated 8 months ago
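The CUS (Comparison of User Summaries) evaluation mentioned above matches each keyframe of an automatic summary against the keyframes of a user-made summary. Below is a minimal sketch of that idea in Python; the histogram representation, the L1 distance, the 0.5 threshold, and the `cus_scores` helper are illustrative assumptions, not the repository's official evaluation script.

```python
# Minimal sketch of a CUS-style comparison (illustrative only, not the
# repository's official evaluation code). Keyframes are assumed to be
# pre-extracted and represented as normalized color histograms.
import numpy as np

def cus_scores(auto_hists, user_hists, threshold=0.5):
    """Return (CUS_A, CUS_E) for one automatic summary vs. one user summary.

    CUS_A = matched automatic keyframes / user keyframes (accuracy rate)
    CUS_E = unmatched automatic keyframes / user keyframes (error rate)
    The L1 distance and the threshold value are placeholder choices.
    """
    remaining = list(user_hists)   # each user keyframe can be matched at most once
    matched, unmatched = 0, 0
    for h in auto_hists:
        # distance to every still-unmatched user keyframe
        dists = [np.abs(h - u).sum() for u in remaining]
        if dists and min(dists) < threshold:
            remaining.pop(int(np.argmin(dists)))
            matched += 1
        else:
            unmatched += 1
    n_user = len(user_hists)
    return matched / n_user, unmatched / n_user
```

In this sketch, aggregating the two rates over multiple user summaries and videos is left to the caller.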
Alternatives and similar repositories for vsumm
Users interested in vsumm are comparing it to the libraries listed below
- [AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity ☆25 · Updated last year
- This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision… ☆29 · Updated last year
- Best Papers of Top Venues like CVPR, NeurIPS, ICLR, ICML, ICCV, ECCV, ... ☆104 · Updated 2 months ago
- Sapsucker Woods 60 Audiovisual Dataset ☆15 · Updated 2 years ago
- ☆9 · Updated 4 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound" ☆140 · Updated 3 years ago
- Official code for TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection, accepted at IC… ☆18 · Updated 4 months ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers. ☆52 · Updated 3 years ago
- ICCV 2021 ☆33 · Updated 3 years ago
- Official code implementation of the paper: XAI for Transformers: Better Explanations through Conservative Propagation ☆63 · Updated 3 years ago
- ☆76 · Updated 2 years ago
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆97 · Updated last year
- Official Pytorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021) ☆132 · Updated last year
- Official Pytorch implementation of EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens [ICML 2024] ☆27 · Updated last year
- Official code repository for SPAct: Self-supervised Privacy Preservation for Action Recognition [CVPR 2022] ☆21 · Updated 3 years ago
- Code for selecting an action based on multimodal inputs; in this case the inputs are voice and text. ☆73 · Updated 4 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 · Updated last year
- Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval," CVPR 2022 ☆108 · Updated 2 years ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆57 · Updated last year
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022) ☆14 · Updated 2 years ago
- This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning" ☆24 · Updated 2 years ago
- ☆22 · Updated last year
- Official implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆28 · Updated 5 months ago
- Pytorch code for the ECCVW 2022 paper "Consistency-based Self-supervised Learning for Temporal Anomaly Localization" ☆14 · Updated 11 months ago
- ☆55 · Updated 2 years ago
- Repository for research works and resources related to model reprogramming <https://arxiv.org/abs/2202.10629> ☆61 · Updated last year
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆69 · Updated last year
- ☆120 · Updated 2 years ago
- A database for studying Brazilian regionalisms through voice. ☆8 · Updated 2 years ago
- PyTorch implementation of the CVPR 2021 paper "Distilling Audio-Visual Knowledge by Compositional Contrastive Learning" ☆88 · Updated 3 years ago