Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022)
☆17 · Dec 20, 2022 · Updated 3 years ago
Alternatives and similar repositories for PACS
Users interested in PACS are comparing it to the repositories listed below.
- Implementation of Practical Facial Landmark Detector (PFLD) on Pytorch ☆14 · Jul 23, 2023 · Updated 2 years ago
- Sapsucker Woods 60 Audiovisual Dataset ☆17 · Oct 7, 2022 · Updated 3 years ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound" ☆146 · Jun 1, 2022 · Updated 3 years ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally? ☆35 · Apr 27, 2023 · Updated 2 years ago
- Pre-training Cross-modal Transformer for Audio-and-Language Representations ☆38 · Apr 20, 2021 · Updated 4 years ago
- [CVPR 2023] Official implementation of the paper: Fine-grained Audible Video Description ☆76 · Dec 4, 2023 · Updated 2 years ago
- Implementation of "Interleaved Latent Visual Reasoning with Selective Perceptual Modeling". ☆44 · Jan 21, 2026 · Updated last month
- 1st place solution to the DCASE 2020 - Task 5 - Urban Sound Tagging with Spatiotemporal Context ☆16 · Dec 8, 2022 · Updated 3 years ago
- Deployed a facial emotion recognition using neural network model which predicts the emotion from faces in images, videos and live feed fr… ☆11 · May 2, 2021 · Updated 4 years ago
- This repository shows how to implement a basic model for multimodal entailment. ☆10 · Aug 17, 2021 · Updated 4 years ago
- ☆16 · Sep 29, 2025 · Updated 5 months ago
- ☆10 · Nov 10, 2021 · Updated 4 years ago
- IRFL: Image Recognition of Figurative Language ☆11 · Nov 30, 2023 · Updated 2 years ago
- structured attention encoder ☆13 · Jun 6, 2018 · Updated 7 years ago
- notebooks to finetune `bert-small-amharic`, `bert-mini-amharic`, and `xlm-roberta-base` models using an Amharic text classification datas… ☆11 · May 10, 2024 · Updated last year
- Reference implementation and test synthetic data for Sorted Center Time echo density measure for acoustic impulse responses ☆15 · Mar 18, 2020 · Updated 5 years ago
- Code for the article "Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning", Outstanding Paper at EMNLP20… ☆10 · Nov 7, 2021 · Updated 4 years ago
- ☆12 · Jan 4, 2022 · Updated 4 years ago
- [CVPR 2024] Official repository of ST_GT ☆10 · Sep 15, 2024 · Updated last year
- ☆13 · Feb 8, 2017 · Updated 9 years ago
- A CNN audio classifier via spectrogram images. ☆10 · Jul 21, 2017 · Updated 8 years ago
- A fine multimodality fusion network :) ☆11 · Aug 9, 2021 · Updated 4 years ago
- ChangeIt dataset with more than 2600 hours of video with state-changing actions published at CVPR 2022 ☆11 · Mar 23, 2022 · Updated 3 years ago
- ☆110 · Dec 23, 2022 · Updated 3 years ago
- Domain Adaptation and Adapters ☆16 · Feb 28, 2023 · Updated 3 years ago
- Multimodal Affective Analysis Using Hierarchical Attention Strategy ☆12 · Dec 7, 2018 · Updated 7 years ago
- Cornell Tech CS5670 Introduction to Computer Vision Projects Repo ☆13 · Nov 22, 2022 · Updated 3 years ago
- More Robots than the Swarm. ☆14 · Feb 9, 2026 · Updated 3 weeks ago
- Acoustic Scene Classification using transfer learning on VGGish pre-trained model ☆11 · Jan 3, 2018 · Updated 8 years ago
- ☆13 · Mar 25, 2021 · Updated 4 years ago
- ☆12 · Nov 15, 2022 · Updated 3 years ago
- Work in progress Meta Quest Pro face and eye tracking utilities ☆17 · Sep 5, 2023 · Updated 2 years ago
- The material is covered in my YouTube playlist "Data Wrangling with Python" available on YUNIKARN. ☆15 · Dec 9, 2025 · Updated 2 months ago
- ☆28 · Jul 24, 2025 · Updated 7 months ago
- A Tensorflow implementation of Speech Emotion Recognition using Audio signals and Text Data ☆12 · May 16, 2022 · Updated 3 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me… ☆13 · May 25, 2023 · Updated 2 years ago
- Labeled Movie Trailer Dataset ☆16 · Mar 23, 2018 · Updated 7 years ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ☆19 · Jan 18, 2026 · Updated last month