pritamqu / CrissCross
[AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
☆25 Jul 11, 2023 Updated 2 years ago
Alternatives and similar repositories for CrissCross
Users who are interested in CrissCross are comparing it to the repositories listed below
- Official code for Tell Me What You See: A Zero-Shot Action Recognition Method Based on Natural Language Descriptions (Multimedia Tools an… ☆12 Mar 8, 2024 Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 May 24, 2023 Updated 2 years ago
- This repo is for action recognition using Kinetics dataset with pytorch ☆11 Aug 5, 2019 Updated 6 years ago
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 Jun 15, 2023 Updated 2 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆41 Dec 23, 2023 Updated 2 years ago
- Official This-Is-My Dataset published in CVPR 2023 ☆16 Jul 18, 2024 Updated last year
- ☆18 Jan 30, 2023 Updated 3 years ago
- ☆13 Jul 20, 2024 Updated last year
- Official Implementation of our Interspeech 2021 paper "An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure … ☆16 Feb 15, 2022 Updated 4 years ago
- CVPR2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency ☆18 Aug 10, 2022 Updated 3 years ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆22 Nov 8, 2023 Updated 2 years ago
- [AAAI 2023 Oral] Official pytorch implementation of "Towards Good Practices for Missing Modality Robust Action Recognition" ☆23 Dec 1, 2022 Updated 3 years ago
- ActMAD: Activation Matching to Align Distributions for Test-Time-Training (CVPR 2023) ☆21 Jun 27, 2023 Updated 2 years ago
- collection of pitch (f0, fundamental frequency) detection algorithms with unified interface ☆24 Nov 25, 2024 Updated last year
- [AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work ☆23 Dec 7, 2025 Updated 2 months ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization ☆26 Jan 6, 2024 Updated 2 years ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge ☆60 Mar 19, 2025 Updated 10 months ago
- Official PyTorch implementation of Vision DiffMask, a post-hoc interpretation method for vision models. ☆32 Mar 5, 2024 Updated last year
- [CVPR2022] Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition ☆28 Sep 15, 2023 Updated 2 years ago
- We build a novel self-supervised segmentation pipeline to segment transparent liquids (clear water) placed inside transparent containers. ☆26 Nov 22, 2022 Updated 3 years ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆25 Nov 23, 2024 Updated last year
- ☆28 Jul 1, 2023 Updated 2 years ago
- A probabilistic model to cluster survival data in a variational deep clustering setting ☆30 Aug 3, 2022 Updated 3 years ago
- ☆29 Jul 4, 2024 Updated last year
- ☆46 Apr 30, 2021 Updated 4 years ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆33 Feb 5, 2023 Updated 3 years ago
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆34 Feb 21, 2025 Updated 11 months ago
- Localizing Visual Sounds the Hard Way ☆82 Jul 6, 2022 Updated 3 years ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally? ☆35 Apr 27, 2023 Updated 2 years ago
- [WACV'22] Code repository for the paper "Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting", https… ☆36 Aug 16, 2022 Updated 3 years ago
- [arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day" ☆138 Aug 23, 2025 Updated 5 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆88 Jun 18, 2024 Updated last year
- ☆31 Jun 18, 2021 Updated 4 years ago
- An R package for analyzing linguistic alignment between partners in conversation transcripts ☆13 Jan 30, 2026 Updated 2 weeks ago
- ☆13 Jan 8, 2024 Updated 2 years ago
- Code for Temporal Data Augmentations (ECCVW 2020) ☆37 Aug 18, 2020 Updated 5 years ago
- Unsupervised Film Genre Classification using Spatio-Temporal Contrastive Learning ☆32 Aug 3, 2023 Updated 2 years ago
- ☆10 Oct 13, 2024 Updated last year
- Repository for the code assignment of the Deep Learning 1 course, Fall 2021 edition ☆10 Oct 31, 2022 Updated 3 years ago