Derpimort / VGGVox-PyTorch

Implementing VGGVox for Speaker Identification on the VoxCeleb1 dataset in PyTorch.

☆25

Alternatives and similar repositories for VGGVox-PyTorch

Users who are interested in VGGVox-PyTorch compare it to the repositories listed below.

joonson / voxceleb_unsupervised
Augmentation adversarial training for self-supervised speaker recognition
☆79 Updated 3 years ago
seongmin-kye / meta-SR
PyTorch implementation of Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs (Interspeech 2020)
☆74 Updated 4 years ago
msh9184 / contrastive-equilibrium-learning
☆21 Updated 4 years ago
dr-pato / audio_visual_speech_enhancement
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
☆109 Updated last year
foamliu / Speaker-Embeddings
PyTorch implementation of a self-attentive speaker embedding
☆17 Updated 5 years ago
glam-imperial / EmotionalConversionStarGAN
This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data…
☆135 Updated 3 years ago
joaoantoniocn / AM-SincNet
The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based on the neural network architec…
☆45 Updated last year
felixkreuk / UnsupSeg
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
☆141 Updated 2 years ago
yuyq96 / D-TDNN
PyTorch implementation of Densely Connected Time Delay Neural Network
☆88 Updated 2 years ago
KunZhou9646 / controllable_evc_code
This is the code for controllable EVC framework for seen and unseen emotion generation.
☆44 Updated 3 years ago
McDonnell-Research-Lab / DCASE2019-Task1
Acoustic Scene Classification Using Deep Residual Networks with Late Fusion of Separated High and Low Frequency Paths - McDonnell and Gao…
☆22 Updated last year
juanmc2005 / SpeakerEmbeddingLossComparison
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…
☆60 Updated 4 years ago
KunZhou9646 / Speaker-independent-emotional-voice-conversion-based-on-conditional-VAW-GAN-and-CWT
This is the implementation of our Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conver…
☆89 Updated 4 years ago
Voice-Privacy-Challenge / Voice-Privacy-Challenge-2020
Baseline Recipe for VoicePrivacy Challenge 2020: https://www.voiceprivacychallenge.org/vp2020/docs/VoicePrivacy_2020_Eval_Plan_v1_3.pdf
☆63 Updated 2 years ago
iariav / End-to-End-VAD
Audio-Visual Voice Activity Detection using Deep Learning
☆49 Updated 6 years ago
Tinglok / CVC
CVC: Contrastive Learning for Non-parallel Voice Conversion (INTERSPEECH 2021, in PyTorch)
☆57 Updated 2 years ago
jefflai108 / pytorch-kaldi-neural-speaker-embeddings
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
☆136 Updated 5 years ago
KrishnaDN / speech-emotion-recognition-using-self-attention
Implementation of the paper "Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning" From I…
☆57 Updated 4 years ago
KrishnaDN / Attentive-Statistics-Pooling-for-Deep-Speaker-Embedding
Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in PyTorch
☆45 Updated 5 years ago
joonson / voxsrc_2019
VoxSRC Challenge
☆31 Updated 6 years ago
RaviSoji / plda
Probabilistic Linear Discriminant Analysis & classification, written in Python.
☆128 Updated 3 years ago
pohanchi / AALBERT
The official repository for Audio ALBERT
☆65 Updated 3 years ago
ktho22 / vctts
PyTorch implementation of "Emotional Voice Conversion using Multitask Learning with Text-to-Speech", accepted to ICASSP 2020
☆29 Updated 2 years ago
nttcslab-sp / agevoxceleb
☆27 Updated 3 years ago
a-nagrani / VoxSRC2020
Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020
☆42 Updated 5 years ago
usc-sail / mica-speech-activity-detection
Robust Speech Activity Detection (SAD) in movie audio
☆26 Updated 4 years ago
KimythAnly / AGAIN-VC
This is the official implementation of the paper AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance No…
☆115 Updated 4 years ago
celebrity-audio-collection / videoprocess
CN-Celeb, a large-scale Chinese celebrities dataset published by Center for Speech and Language Technology (CSLT) at Tsinghua University.
☆74 Updated 5 years ago
jefflai108 / ASSERT
JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT).
☆57 Updated 2 years ago
linhdvu14 / vggvox-speaker-identification
Speaker identification with VGGVox network
☆83 Updated 6 years ago