Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆167 · Apr 12, 2020 · Updated 6 years ago
Alternatives and similar repositories for syncnet_trainer
Users that are interested in syncnet_trainer are comparing it to the libraries listed below.
- Out of time: automated lip sync in the wild ☆891 · Apr 17, 2026 · Updated 2 months ago
- Augmentation adversarial training for self-supervised speaker recognition ☆77 · Aug 15, 2021 · Updated 4 years ago
- In defence of metric learning for speaker recognition ☆1,169 · Apr 22, 2026 · Updated 2 months ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆43 · Jul 17, 2020 · Updated 5 years ago
- the dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset" ☆432 · May 12, 2024 · Updated 2 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆115 · Nov 16, 2020 · Updated 5 years ago
- Utterance-level Aggregation For Speaker Recognition In The Wild ☆372 · Mar 24, 2023 · Updated 3 years ago
- Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released ☆11 · Nov 8, 2021 · Updated 4 years ago
- Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices ☆73 · Apr 7, 2024 · Updated 2 years ago
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP) ☆32 · Feb 28, 2025 · Updated last year
- A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. ☆244 · Feb 15, 2024 · Updated 2 years ago
- ☆42 · Nov 22, 2024 · Updated last year
- [InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei … ☆207 · Dec 8, 2022 · Updated 3 years ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset ☆22 · Jun 6, 2025 · Updated last year
- A light weight neural speaker embeddings extraction based on Kaldi and PyTorch. ☆136 · Jan 27, 2020 · Updated 6 years ago
- PyTorch implementation of RPNSD ☆60 · Jun 17, 2024 · Updated 2 years ago
- ☆21 · Apr 6, 2021 · Updated 5 years ago
- Unsupervised Speech Decomposition via Triple Information Bottleneck ☆14 · Apr 29, 2020 · Updated 6 years ago
- ☆105 · Jul 5, 2023 · Updated 2 years ago
- Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips" ☆32 · May 16, 2019 · Updated 7 years ago
- SyncNet for Time Synchronization ☆30 · Mar 13, 2023 · Updated 3 years ago
- Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (Arxiv 2020) and "Predicting Personalize… ☆774 · Dec 15, 2023 · Updated 2 years ago
- ☆840 · Nov 19, 2025 · Updated 7 months ago
- Demo for 2022 Interspeech ☆29 · Jun 14, 2022 · Updated 4 years ago
- ☆18 · Nov 22, 2024 · Updated last year
- Code and instruction on replicating the experiments done in paper: Unified Hypersphere Embedding for Speaker Recognition ☆32 · Jul 14, 2019 · Updated 6 years ago
- A self-supervised learning framework for audio-visual speech ☆987 · Dec 7, 2023 · Updated 2 years ago
- Audio-Visual Speech Separation with Cross-Modal Consistency ☆250 · Jul 25, 2023 · Updated 2 years ago
- Tensorflow implementation of x-vector topology on top of Kaldi recipe ☆118 · Nov 5, 2019 · Updated 6 years ago
- ☆65 · Jun 28, 2023 · Updated 3 years ago
- Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024) ☆17 · Mar 17, 2025 · Updated last year
- PyTorch implementation of "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator" ☆215 · Aug 8, 2023 · Updated 2 years ago
- Real-time melgan based on cpu !!! ☆13 · Dec 3, 2019 · Updated 6 years ago
- Official github repo for paper "What comprises a good talking-head video generation?: A Survey and Benchmark" ☆91 · Dec 8, 2022 · Updated 3 years ago
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020] ☆304 · Jul 7, 2024 · Updated last year
- Pytorch implementation of Generalized End-to-End Loss for speaker verification ☆88 · Apr 23, 2019 · Updated 7 years ago
- ☆17 · Aug 27, 2025 · Updated 10 months ago
- Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196 ☆322 · Nov 11, 2020 · Updated 5 years ago
- ☆428 · Nov 1, 2023 · Updated 2 years ago