Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆166 · Apr 12, 2020 · Updated 5 years ago
Alternatives and similar repositories for syncnet_trainer
Users who are interested in syncnet_trainer are comparing it to the libraries listed below.
- Out of time: automated lip sync in the wild ☆870 · Jan 23, 2024 · Updated 2 years ago
- Augmentation adversarial training for self-supervised speaker recognition ☆78 · Aug 15, 2021 · Updated 4 years ago
- In defence of metric learning for speaker recognition ☆1,163 · Mar 26, 2024 · Updated last year
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆43 · Jul 17, 2020 · Updated 5 years ago
- The dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset" ☆422 · May 12, 2024 · Updated last year
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei … ☆209 · Dec 8, 2022 · Updated 3 years ago
- Audio-visual diarization pipeline used for creating the VoxConverse dataset ☆21 · Jun 6, 2025 · Updated 8 months ago
- Real-time MelGAN on CPU ☆13 · Dec 3, 2019 · Updated 6 years ago
- Utterance-level Aggregation For Speaker Recognition In The Wild ☆372 · Mar 24, 2023 · Updated 2 years ago
- ☆17 · Aug 27, 2025 · Updated 6 months ago
- PyTorch implementation of RPNSD ☆60 · Jun 17, 2024 · Updated last year
- Implementation for the ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆115 · Nov 16, 2020 · Updated 5 years ago
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · May 13, 2024 · Updated last year
- PyTorch implementation of Generalized End-to-End Loss for speaker verification ☆88 · Apr 23, 2019 · Updated 6 years ago
- Lightweight neural speaker embedding extraction based on Kaldi and PyTorch. ☆136 · Jan 27, 2020 · Updated 6 years ago
- ☆105 · Jul 5, 2023 · Updated 2 years ago
- Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips" ☆32 · May 16, 2019 · Updated 6 years ago
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆242 · Feb 15, 2024 · Updated 2 years ago
- Unsupervised Speech Decomposition via Triple Information Bottleneck ☆14 · Apr 29, 2020 · Updated 5 years ago
- Optimized SyncNet and Chinese-enhanced version; EN and CN checkpoints released ☆11 · Nov 8, 2021 · Updated 4 years ago
- Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices ☆73 · Apr 7, 2024 · Updated last year
- ☆21 · Apr 6, 2021 · Updated 4 years ago
- ☆25 · Mar 12, 2022 · Updated 3 years ago
- SyncNet for Time Synchronization ☆30 · Mar 13, 2023 · Updated 2 years ago
- Anonymous ICLR Submission ☆14 · Sep 25, 2019 · Updated 6 years ago
- Code and instructions for replicating the experiments in the paper "Unified Hypersphere Embedding for Speaker Recognition" ☆32 · Jul 14, 2019 · Updated 6 years ago
- Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (arXiv 2020) and "Predicting Personalize… ☆774 · Dec 15, 2023 · Updated 2 years ago
- A self-supervised learning framework for audio-visual speech ☆969 · Dec 7, 2023 · Updated 2 years ago
- PyTorch implementation of the Deep Griffin-Lim Iteration paper (https://arxiv.org/abs/1903.03971) ☆39 · Oct 12, 2019 · Updated 6 years ago
- ☆15 · May 8, 2021 · Updated 4 years ago
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP) ☆29 · Feb 28, 2025 · Updated last year
- PyTorch implementation of "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator" ☆214 · Aug 8, 2023 · Updated 2 years ago
- [CVPR 2023] Official code for the paper "Learning to Dub Movies via Hierarchical Prosody Models". ☆111 · Jun 21, 2024 · Updated last year
- The project page repo for Neural Dubber. ☆30 · Sep 20, 2023 · Updated 2 years ago
- Interface for Controllable Expressive Talking Machine ☆40 · Sep 20, 2025 · Updated 5 months ago
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020] ☆290 · Jul 7, 2024 · Updated last year
- ☆42 · Nov 22, 2024 · Updated last year
- ☆18 · Nov 22, 2024 · Updated last year
- Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllabl… ☆160 · Jun 5, 2025 · Updated 8 months ago