afourast/avobjects

Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"

☆114

Alternatives and similar repositories for avobjects

Users interested in avobjects are comparing it to the libraries listed below.

uark-cviu / Right2Talk
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20 · Aug 2, 2021 · Updated 4 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102 · Dec 4, 2024 · Updated last year
krantiparida / awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
☆775 · Jan 30, 2024 · Updated 2 years ago
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
facebookresearch / selavi
This repo covers the implementation for Labelling unlabelled videos from scratch with multi-modal self-supervision, which learns clusters…
☆118 · Apr 26, 2021 · Updated 5 years ago
tuanchien / asd
Active Speaker Detection
☆19 · Jun 19, 2020 · Updated 6 years ago
rhgao / co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆98 · Jul 25, 2023 · Updated 2 years ago
SheldonTsui / SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
☆72 · Oct 20, 2020 · Updated 5 years ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250 · Jul 25, 2023 · Updated 2 years ago
DTaoo / Simplified_DMC
A simplified version for DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning)
☆19 · May 27, 2020 · Updated 6 years ago
TengdaHan / CoCLR
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆288 · Oct 10, 2021 · Updated 4 years ago
facebookresearch / 2.5D-Visual-Sound
2.5D visual sound
☆121 · Jul 25, 2023 · Updated 2 years ago
Yuanbo2020 / Audio-Visual-VAD
☆13 · May 9, 2022 · Updated 4 years ago
andrewowens / multisensory
Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
☆225 · Jul 17, 2019 · Updated 7 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91 · Oct 24, 2022 · Updated 3 years ago
rohitrango / objects-that-sound
Unofficial Implementation of Google Deepmind's paper `Objects that Sound`
☆83 · May 7, 2018 · Updated 8 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
afourast / deep_lip_reading
Code and models for evaluating a state-of-the-art lip reading network
☆196 · Mar 24, 2023 · Updated 3 years ago
YapengTian / CCOL-CVPR21
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
☆26 · Nov 24, 2021 · Updated 4 years ago
zcxu-eric / AVA-AVD
☆51 · Nov 24, 2022 · Updated 3 years ago
joonson / syncnet_trainer
Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆167 · Apr 12, 2020 · Updated 6 years ago
roger-tseng / av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
☆58 · Apr 17, 2024 · Updated 2 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22 · Jun 6, 2025 · Updated last year
hangzhaomit / Sound-of-Pixels
Codebase for ECCV18 "The Sound of Pixels"
☆393 · Apr 25, 2022 · Updated 4 years ago
hohsiangwu / rethinking-visual-sound-localization
Official implementation of the paper How to Listen? Rethinking Visual Sound Localization
☆18 · Apr 25, 2022 · Updated 4 years ago
okankop / ASDNet
Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset
☆73 · Jan 18, 2022 · Updated 4 years ago
DTaoo / DMC
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15 · May 27, 2020 · Updated 6 years ago
stoneMo / MGN
Official implementation for MGN
☆20 · Dec 22, 2022 · Updated 3 years ago
zaocan666 / DyViSE
Dynamic vision-guided speaker embedding for audio-visual speaker diarization
☆12 · Jul 5, 2022 · Updated 4 years ago
joonson / syncnet_python
Out of time: automated lip sync in the wild
☆894 · Apr 17, 2026 · Updated 3 months ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
Sindhu-Hegde / multivsr
Official code for the paper "Scaling Multilingual Visual Speech Recognition"
☆20 · Aug 15, 2025 · Updated 11 months ago
karreny / telling-left-from-right
Project website for "Telling left from right: Learning spatial correspondence between sight and sound"
☆29 · Jun 6, 2022 · Updated 4 years ago
swimmiing / ACL-SSL
Repository of the IJCV'26 & WACV'24 paper
☆34 · Apr 27, 2026 · Updated 2 months ago
TaoRuijie / TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
☆488 · Oct 23, 2023 · Updated 2 years ago
fuankarion / active-speakers-context
Code for the Active Speakers in Context Paper (CVPR2020)
☆58 · May 19, 2021 · Updated 5 years ago
JusperLee / Looking-to-Listen-at-the-Cocktail-Party
Executable code based on Google articles
☆166 · Dec 8, 2022 · Updated 3 years ago