facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆232 Updated last year
Alternatives and similar repositories for VisualVoice
Users who are interested in VisualVoice are comparing it to the repositories listed below.
- Deep-Learning-Based Audio-Visual Speech Enhancement and Separation ☆210 Updated 2 years ago
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆234 Updated last year
- Code for the Active Speakers in Context Paper (CVPR2020) ☆54 Updated 4 years ago
- Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053 ☆145 Updated 3 years ago
- Official repository for RawNet, RawNet2, and RawNet3 ☆382 Updated last year
- Disentangled Speech Embeddings using Cross-Modal Self-Supervision ☆160 Updated 5 years ago
- Includes some core functions and models for speech separation ☆155 Updated 4 years ago
- Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments ☆109 Updated last year
- ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection' ☆399 Updated last year
- ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning' ☆90 Updated 2 years ago
- The repository for IEEE CVPR 2023 (A Light Weight Model for Active Speaker Detection) ☆147 Updated 3 months ago
- Audio-Visual Active Speaker Detection with PyTorch on AVA-ActiveSpeaker dataset ☆64 Updated 3 years ago
- The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement". ☆265 Updated last year
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente… ☆119 Updated last year
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei … ☆208 Updated 2 years ago
- Executable code based on Google articles ☆164 Updated 2 years ago
- Augmentation adversarial training for self-supervised speaker recognition ☆79 Updated 3 years ago
- A summary of speech data augmentation algorithms ☆69 Updated 4 years ago
- ☆45 Updated 2 years ago
- Pytorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training. ☆17 Updated 3 years ago
- A PyTorch implementation of End-to-End Neural Diarization ☆109 Updated 2 years ago
- Audio-Visual Speech Recognition using Sequence to Sequence Models ☆82 Updated 5 years ago
- ☆39 Updated 7 months ago
- Speaker embedding (d-vector) trained with GE2E loss ☆282 Updated last year
- PPG-Based Voice Conversion ☆341 Updated 2 years ago
- Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper ☆142 Updated last year
- Accepted by TMM 2022 ☆16 Updated 2 years ago
- VGGSound: A Large-scale Audio-Visual Dataset ☆322 Updated 3 years ago
- Implementation of "Duration Informed Attention Network for Multimodal Synthesis" paper in PyTorch. ☆183 Updated 4 years ago
- [WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition ☆97 Updated 2 years ago