joonson/syncnet_trainer

Disentangled Speech Embeddings using Cross-Modal Self-Supervision

☆167

Alternatives and similar repositories for syncnet_trainer

Users that are interested in syncnet_trainer are comparing it to the libraries listed below.

joonson / syncnet_python
Out of time: automated lip sync in the wild
☆894 · Apr 17, 2026 · Updated 3 months ago
joonson / voxceleb_unsupervised
Augmentation adversarial training for self-supervised speaker recognition
☆77 · Aug 15, 2021 · Updated 4 years ago
clovaai / voxceleb_trainer
In defence of metric learning for speaker recognition
☆1,170 · Apr 22, 2026 · Updated 2 months ago
a-nagrani / VoxSRC2020
Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020
☆43 · Jul 17, 2020 · Updated 6 years ago
MRzzm / HDTF
the dataset and code for "Flow-guided One-shot Talking Face Generation with a High-resolution Audio-visual Dataset"
☆429 · May 12, 2024 · Updated 2 years ago
WeidiXie / VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild
☆371 · Mar 24, 2023 · Updated 3 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
julianyulu / SyncNetCN
Optimized Syncnet and Chinese enhanced version, EN and CN checkpoints released
☆11 · Nov 8, 2021 · Updated 4 years ago
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆73 · Apr 7, 2024 · Updated 2 years ago
TaoRuijie / SEANet
Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)
☆32 · Feb 28, 2025 · Updated last year
smeetrs / deep_avsr
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
☆244 · Feb 15, 2024 · Updated 2 years ago
zexupan / MuSE
☆42 · Nov 22, 2024 · Updated last year
VITA-Group / AutoSpeech
[InterSpeech 2020] "AutoSpeech: Neural Architecture Search for Speaker Recognition" by Shaojin Ding*, Tianlong Chen*, Xinyu Gong, Weiwei …
☆206 · Dec 8, 2022 · Updated 3 years ago
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating VoxConverse dataset
☆22 · Jun 6, 2025 · Updated last year
jefflai108 / pytorch-kaldi-neural-speaker-embeddings
A light weight neural speaker embeddings extraction based on Kaldi and PyTorch.
☆136 · Jan 27, 2020 · Updated 6 years ago
HuangZiliAndy / RPNSD
PyTorch implementation of RPNSD
☆60 · Jun 17, 2024 · Updated 2 years ago
msh9184 / contrastive-equilibrium-learning
☆21 · Apr 6, 2021 · Updated 5 years ago
auspicious3000 / SpeechSplit-Demo
Unsupervised Speech Decomposition via Triple Information Bottleneck
☆14 · Apr 29, 2020 · Updated 6 years ago
dc3ea9f / vico_challenge_baseline
☆105 · Jul 5, 2023 · Updated 3 years ago
tavihalperin / AV-sync
Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips"
☆32 · May 16, 2019 · Updated 7 years ago
madhavlab / 2022_syncnet
SyncNet for Time Synchronization
☆30 · Mar 13, 2023 · Updated 3 years ago
yiranran / Audio-driven-TalkingFace-HeadPose
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (Arxiv 2020) and "Predicting Personalize…
☆772 · Dec 15, 2023 · Updated 2 years ago
jlian2 / Improved-Voice-Conversion-with-Conditional-DSVAE
Demo for 2022 Interspeech
☆29 · Jun 14, 2022 · Updated 4 years ago
YunjinPark / awesome_talking_face_generation
☆838 · Nov 19, 2025 · Updated 8 months ago
zexupan / reentry
☆18 · Nov 22, 2024 · Updated last year
MahdiHajibabaei / unified-embedding
Code and instruction on replicating the experiments done in paper: Unified Hypersphere Embedding for Speaker Recognition
☆32 · Jul 14, 2019 · Updated 7 years ago
facebookresearch / av_hubert
A self-supervised learning framework for audio-visual speech
☆992 · Dec 7, 2023 · Updated 2 years ago
facebookresearch / VisualVoice
Audio-Visual Speech Separation with Cross-Modal Consistency
☆250 · Jul 25, 2023 · Updated 2 years ago
BUTSpeechFIT / x-vector-kaldi-tf
Tensorflow implementation of x-vector topology on top of Kaldi recipe
☆118 · Nov 5, 2019 · Updated 6 years ago
lin9x / AV-Sepformer
☆65 · Jun 28, 2023 · Updated 3 years ago
rgzn-aiyun / melgan-cpu
Real-time melgan based on cpu ！！！
☆13 · Dec 3, 2019 · Updated 6 years ago
guanjz20 / StyleSync_PyTorch
PyTorch implementation of "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator"
☆215 · Aug 8, 2023 · Updated 2 years ago
lelechen63 / talking-head-generation-survey
Official github repo for paper "What comprises a good talking-head video generation?: A Survey and Benchmark"
☆91 · Dec 8, 2022 · Updated 3 years ago
uniBruce / Mead
MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation [ECCV2020]
☆305 · Jul 7, 2024 · Updated 2 years ago
nc-ai / speech
☆17 · Aug 27, 2025 · Updated 10 months ago
cvqluu / GE2E-Loss
Pytorch implementation of Generalized End-to-End Loss for speaker verification
☆88 · Apr 23, 2019 · Updated 7 years ago
manojpamk / pytorch_xvectors
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
☆321 · Nov 11, 2020 · Updated 5 years ago
Sxjdwang / TalkLip
☆429 · Nov 1, 2023 · Updated 2 years ago
Delay-Xili / uCTRL
☆15 · Apr 6, 2023 · Updated 3 years ago