facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆221 · Updated last year
Related projects
Alternatives and complementary repositories for VisualVoice
- Deep-Learning-Based Audio-Visual Speech Enhancement and Separation ☆203 · Updated last year
- Disentangled Speech Embeddings using Cross-Modal Self-Supervision ☆154 · Updated 4 years ago
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆209 · Updated 9 months ago
- Auto-AVSR: Lip-Reading Sentences Project ☆178 · Updated 7 months ago
- Code for the Active Speakers in Context Paper (CVPR2020) ☆53 · Updated 3 years ago
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S… ☆390 · Updated last year
- The PyTorch Code and Model In "Learn an Effective Lip Reading Model without Pains", (https://arxiv.org/abs/2011.07557), which reaches the… ☆149 · Updated last year
- Official repository for RawNet, RawNet2, and RawNet3 ☆360 · Updated 8 months ago
- Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo! ☆340 · Updated 2 years ago
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente… ☆106 · Updated 11 months ago
- Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-R… ☆308 · Updated last year
- VGGSound: A Large-scale Audio-Visual Dataset ☆291 · Updated 3 years ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆103 · Updated 7 months ago
- Audio-Visual Speech Recognition using Sequence to Sequence Models ☆81 · Updated 4 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments ☆106 · Updated 8 months ago
- Co-Separating Sounds of Visual Objects (ICCV 2019) ☆94 · Updated last year
- Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation ☆508 · Updated last month
- Include some core functions and model to handle speech separation ☆154 · Updated 3 years ago
- PPG-Based Voice Conversion ☆329 · Updated 2 years ago
- Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset ☆57 · Updated 2 years ago
- This is the GitHub page for publicly available emotional speech data. ☆322 · Updated 2 years ago
- UniSpeech - Large Scale Self-Supervised Learning for Speech ☆434 · Updated 7 months ago
- ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning' ☆87 · Updated last year
- An open source dataset for source separation ☆380 · Updated 9 months ago
- Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021 ☆103 · Updated 5 months ago
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild. ☆41 · Updated 9 months ago
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei … ☆208 · Updated last year
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆111 · Updated 4 years ago
- A PyTorch implementation of "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" (see recipes in aps framework https:/… ☆209 · Updated last year
- Reading list for research topics in Sound AI ☆166 · Updated 3 months ago