[AAAI 2023 (Oral)] CrissCross: Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
☆26Jul 11, 2023Updated 2 years ago
Alternatives and similar repositories for CrissCross
Users that are interested in CrissCross are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- Code for "Repetitive Activity Counting by Sight and Sound"☆24Oct 29, 2021Updated 4 years ago
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- Official code for Tell Me What You See: A Zero-Shot Action Recognition Method Based on Natural Language Descriptions (Multimedia Tools an…☆12Mar 8, 2024Updated 2 years ago
- ☆13Jul 20, 2024Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆19Jan 30, 2023Updated 3 years ago
- This repo is for action recognition using Kinetics dataset with pytorch☆11Aug 5, 2019Updated 6 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation☆42Dec 23, 2023Updated 2 years ago
- BEAR: a new BEnchmark on video Action Recognition☆46Apr 21, 2024Updated last year
- Learning from Temporal Gradient for Semi-supervised Action Recognition (CVPR 2022)☆30Dec 1, 2022Updated 3 years ago
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)☆16Jan 18, 2024Updated 2 years ago
- Official This-Is-My Dataset published in CVPR 2023☆16Jul 18, 2024Updated last year
- Contrastive Bayesian Analysis for Deep Metric Learning and an Integrated Deep Metric Learning Toolbox Based on Pytorch☆13Dec 27, 2022Updated 3 years ago
- [AAAI 2023] AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work☆23Dec 7, 2025Updated 4 months ago
- Serverless GPU API endpoints on Runpod - Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆27Jan 6, 2024Updated 2 years ago
- ActMAD: Activation Matching to Align Distributions for Test-Time-Training (CVPR 2023)☆21Jun 27, 2023Updated 2 years ago
- Code and instructions accompanying ICCV'23 paper Protoype-based Dataset Comparison☆18Dec 15, 2023Updated 2 years ago
- Code release for "MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning"☆11Oct 11, 2024Updated last year
- This repository hosts code for converting the original MLP Mixer models (JAX) to TensorFlow.☆15Sep 29, 2021Updated 4 years ago
- [AAAI 2023 Oral] Official pytorch implementation of "Towards Good Practices for Missing Modality Robust Action Recognition"☆24Dec 1, 2022Updated 3 years ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆24Feb 11, 2026Updated 2 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!☆25Nov 23, 2024Updated last year
- ☆58Dec 2, 2025Updated 4 months ago
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Deep learning based food instance segmentation using synthetic dataset. We provide the trainig, evaluate and inference codes. Also the tr…☆14Jun 1, 2023Updated 2 years ago
- ☆29Feb 15, 2026Updated 2 months ago
- Code for Temporal Data Augmentations (ECCVW 2020)☆37Aug 18, 2020Updated 5 years ago
- A voice spoofing detection system, based on paper presented at ICSPIS 2021☆10Feb 11, 2022Updated 4 years ago
- ☆14Apr 25, 2025Updated 11 months ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model'☆33Nov 7, 2023Updated 2 years ago
- ☆13Jun 2, 2024Updated last year
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Oct 5, 2022Updated 3 years ago
- Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning.☆12Jun 9, 2023Updated 2 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Implementation for "SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification," in pytorch.☆28Jan 18, 2024Updated 2 years ago
- An R package for analyzing linguistic alignment between partners in conversation transcripts☆14Updated this week
- Unofficial Implementation of Google Deepmind's paper `Objects that Sound`☆83May 7, 2018Updated 7 years ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge☆65Mar 19, 2025Updated last year
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)☆91Oct 24, 2022Updated 3 years ago
- ☆13Jan 8, 2024Updated 2 years ago
- Official PyTorch implementation of Vision DiffMask, a post-hoc interpretation method for vision models.☆32Mar 5, 2024Updated 2 years ago