walkoncross / voxceleb2-download

Tools for downloading the VoxCeleb2 dataset

☆26

Related projects

Alternatives and complementary repositories for voxceleb2-download

zcxu-eric / AVA-AVD
☆45 · Updated last year
ms-dot-k / Visual-Context-Attentional-GAN
PyTorch implementation of "Lip to Speech Synthesis with Visual Context Attentional GAN" (NeurIPS2021)
☆22 · Updated 8 months ago
v-manhlt3 / Disentangle-VAE-for-VC
☆21 · Updated 2 years ago
joannahong / Lip2Wav-pytorch
A PyTorch implementation of Lip2Wav
☆49 · Updated 2 years ago
MKT-Dataoceanai / CNVSRC2023Baseline
Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)
☆21 · Updated 6 months ago
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆61 · Updated 7 months ago
naver-ai / facetts
☆48 · Updated last year
joonson / syncnet_trainer
Disentangled Speech Embeddings using Cross-Modal Self-Supervision
☆154 · Updated 4 years ago
Tinglok / CVC
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)
☆57 · Updated 2 years ago
ahaliassos / raven
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
☆55 · Updated 4 months ago
KunZhou9646 / Emovox
This is the implementation of the paper "Emotion Intensity and its Control for Emotional Voice Conversion".
☆81 · Updated 2 years ago
YoungSeng / SRD-VC
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)
☆112 · Updated 9 months ago
biggytruck / SpeechSplit2
Official implementation of SpeechSplit2
☆128 · Updated 2 years ago
ms-dot-k / Lip-to-Speech-Synthesis-in-the-Wild
PyTorch implementation of "Lip to Speech Synthesis in the Wild with Multi-task Learning" (ICASSP2023)
☆65 · Updated 8 months ago
KunZhou9646 / controllable_evc_code
This is the code for controllable EVC framework for seen and unseen emotion generation.
☆41 · Updated 3 years ago
tavihalperin / AV-sync
Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips"
☆30 · Updated 5 years ago
DanielMengLiu / AudioVisualLip
☆20 · Updated 9 months ago
Jiang-Yidi / TS-TalkNet
INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues
☆46 · Updated last year
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆34 · Updated last year
X-LANCE / MSDWILD
[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.
☆41 · Updated 9 months ago
CODEJIN / AutoVC
☆28 · Updated 4 years ago
maum-ai / sane-tts
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11 · Updated last year
Labmem-Zhouyx / CDFSE_FastSpeech2
The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth…
☆81 · Updated last year
TaoRuijie / AVCleanse
ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'
☆35 · Updated 2 years ago
leibniz-future-lab / SelfDistill-SER
☆19 · Updated last year
hhguo / MSMC-TTS
Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS
☆162 · Updated 7 months ago
facebookresearch / facestar
Facestar dataset. High quality audio-visual recordings of human conversational speech.
☆104 · Updated 2 years ago
Moon0316 / T2A
Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023
☆83 · Updated last year
KimythAnly / AGAIN-VC
This is the official implementation of the paper AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance No…
☆111 · Updated 3 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆29 · Updated last year