Alternatives and similar repositories for speech
Users that are interested in speech are comparing it to the libraries listed below
- Korean ASR Corpus generated from TEDx talks☆27Jan 11, 2019Updated 7 years ago
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…☆17Apr 27, 2023Updated 2 years ago
- ☆15May 8, 2021Updated 4 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- ☆70Jan 7, 2021Updated 5 years ago
- A package for crawling audio from YouTube☆42Aug 8, 2023Updated 2 years ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆80Sep 27, 2023Updated 2 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es…☆19Jun 14, 2021Updated 4 years ago
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge☆15Mar 26, 2022Updated 3 years ago
- A CSRankings-like index for speech researchers☆35Oct 16, 2024Updated last year
- Jejueo Datasets for Machine Translation and Speech Synthesis☆83Feb 19, 2020Updated 6 years ago
- Official implementation of BVAE-TTS☆173Sep 26, 2022Updated 3 years ago
- Simple tool for speech dataset augmentation for modeling various prosodies.☆14Jan 14, 2021Updated 5 years ago
- PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis☆41Feb 20, 2022Updated 4 years ago
- Visualizing the Music Transformer attention☆27Nov 15, 2019Updated 6 years ago
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Sep 16, 2022Updated 3 years ago
- Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing☆89Sep 6, 2024Updated last year
- Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset☆15Apr 7, 2025Updated 11 months ago
- Google's TPGST reimplementation.☆34Dec 11, 2019Updated 6 years ago
- Lightweight speaker anonymization [IEEE SLT2021]☆27Jun 6, 2022Updated 3 years ago
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513☆64Feb 13, 2023Updated 3 years ago
- Text-To-Speech for NotebookLM☆39Jul 20, 2025Updated 8 months ago
- Deep Convolutional TTS pytorch implementation☆27Jul 2, 2019Updated 6 years ago
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper.☆16Jul 22, 2021Updated 4 years ago
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference☆30May 28, 2020Updated 5 years ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Dec 10, 2020Updated 5 years ago
- Implementation of the AlignTTS☆77Jul 6, 2023Updated 2 years ago
- Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language☆43Feb 28, 2018Updated 8 years ago
- Implementation of Korean FastSpeech2☆215Jan 29, 2023Updated 3 years ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- A TensorFlow implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech☆11Aug 12, 2020Updated 5 years ago
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training☆147Aug 22, 2022Updated 3 years ago
- ☆17Apr 14, 2023Updated 2 years ago
- ☆12Jul 6, 2023Updated 2 years ago
- End-to-end Text-to-Speech with Generative Adversarial Networks☆20Feb 6, 2021Updated 5 years ago
- ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)☆223Apr 5, 2022Updated 3 years ago