deepaudio / deepaudio-speaker
Neural-network-based speaker embedder
☆25 Updated 2 years ago
Alternatives and similar repositories for deepaudio-speaker
Users who are interested in deepaudio-speaker are comparing it to the repositories listed below.
- ☆33 Updated 3 years ago
- PPSpeech: Phrase based Parallel End-to-End TTS System ☆35 Updated 4 years ago
- Open Source Speech/Text Data on AI ☆18 Updated 2 years ago
- ☆56 Updated 2 years ago
- Modular and extensible speech recognition library leveraging pytorch-lightning and hydra. ☆45 Updated 4 years ago
- ☆64 Updated 3 years ago
- Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration ☆34 Updated 3 years ago
- Transfer Learning from Monolingual ASR to Transcription-free Cross-lingual Voice Conversion ☆40 Updated 2 years ago
- Python implementation of CTC beam search decoder + agnostic LM scorer ☆19 Updated 4 years ago
- Torch-based tool for quantizing high-dimensional vectors using additive codebooks ☆54 Updated 3 years ago
- ☆36 Updated 2 years ago
- TTS Text Analyzer ☆32 Updated last year
- streaming attention networks for end-to-end automatic speech recognition ☆55 Updated 5 years ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network ☆45 Updated 3 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated 2 years ago
- Implementation of the AlignTTS ☆76 Updated last year
- SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification ☆30 Updated 2 years ago
- Torch implementation of Whisper-guided DDPM based Voice Conversion ☆49 Updated 2 years ago
- This repo contains conv-tasnet for basis-melgan. If you want to get code of basis-melgan, please refer to FastVocoder. ☆20 Updated 3 years ago
- ☆25 Updated 7 months ago
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation ☆39 Updated 4 years ago
- Decoders from Kaldi using OpenFst ☆28 Updated 4 months ago
- multilingual speech aligner ☆74 Updated last year
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition" ☆17 Updated 4 years ago
- video cut powered by AI ☆25 Updated 2 years ago
- An unofficial implementation of https://arxiv.org/abs/2005.05106 ☆46 Updated 4 years ago
- ☆24 Updated 3 years ago
- Implementation of "FastSpeech: Fast, Robust and Controllable Text to Speech" ☆52 Updated 5 years ago
- A handy dataset of noises for ASR ☆21 Updated 6 years ago
- Source code for INTERSPEECH2020 ☆11 Updated 4 years ago