Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch
☆106Jul 20, 2020Updated 5 years ago
Alternatives and similar repositories for x-vector-pytorch
Users that are interested in x-vector-pytorch are comparing it to the libraries listed below
- Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196☆320Nov 11, 2020Updated 5 years ago
- Time delay neural network (TDNN) implementation in PyTorch using the unfold method☆204Nov 21, 2019Updated 6 years ago
- A PyTorch implementation of x-vector embeddings☆79Mar 28, 2020Updated 5 years ago
- Voice conversion training with 109 speakers with limited training samples☆35Dec 21, 2020Updated 5 years ago
- A collection of utilities for handling IPA phones.☆26Sep 24, 2023Updated 2 years ago
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in Pytorch.☆29Mar 3, 2022Updated 3 years ago
- Adversarial attack and defense strategies for deep speaker recognition systems☆42Feb 18, 2021Updated 5 years ago
- Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)☆790Apr 11, 2024Updated last year
- Taiwanese Speech Synthesis with Tacotron2☆25Oct 2, 2022Updated 3 years ago
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆16Dec 3, 2024Updated last year
- Code for paper "Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion"☆36Jan 15, 2020Updated 6 years ago
- xvector model on jtubespeech☆47Nov 5, 2023Updated 2 years ago
- Google's TPGST reimplementation.☆34Dec 11, 2019Updated 6 years ago
- Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus☆13Oct 15, 2022Updated 3 years ago
- ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…☆43Dec 17, 2020Updated 5 years ago
- Experiments on speech recognition robustness to accents and dialects☆12Apr 2, 2019Updated 6 years ago
- Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra☆16Dec 10, 2024Updated last year
- Predicts the level of noise and reverberation on your audiofiles☆178Jun 17, 2025Updated 8 months ago
- An Open Source Tool for Speaker Recognition☆635Aug 5, 2024Updated last year
- Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion☆40Oct 22, 2022Updated 3 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45May 25, 2021Updated 4 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆150Jan 16, 2024Updated 2 years ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- This repository creates speaker diarization recipes to be used within the egs folder of kaldi.☆17Aug 12, 2024Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- ASR & TTS joint training, asr, tts, machine speech chain☆16Oct 16, 2021Updated 4 years ago
- Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)☆148Jul 6, 2023Updated 2 years ago
- Resources that make every language unique☆26Feb 21, 2026Updated last week
- ToneNet: A CNN Model of Tone Classification of Mandarin Chinese☆20Nov 27, 2019Updated 6 years ago
- MSR Identity Toolkit v1.0☆17Aug 18, 2017Updated 8 years ago
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…☆34Mar 31, 2023Updated 2 years ago
- Bilingual-TTS (Japanese and Korean)☆32Jul 1, 2023Updated 2 years ago
- This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1☆114May 22, 2019Updated 6 years ago
- Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…☆55Nov 4, 2022Updated 3 years ago