VinAIResearch / XPhoneBERT
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)
☆323 Updated 10 months ago
Alternatives and similar repositories for XPhoneBERT
Users who are interested in XPhoneBERT are comparing it to the repositories listed below.
- Vi_G2P or ViG2P: G2P package for Vietnamese: based on vPhon and phonology knowledge to convert Raw text - Graphoneme to IPA ☆87 Updated 11 months ago
- Easy-to-Use Speech MOS predictors ☆288 Updated last year
- Updates ASR papers every day ☆227 Updated this week
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆200 Updated last year
- Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V… ☆225 Updated 10 months ago
- Unofficial implementation of NVIDIA P-Flow TTS paper ☆223 Updated 5 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆77 Updated 6 months ago
- Multilingual G2P in 100 languages ☆327 Updated 2 years ago
- Training code for FACodec presented in NaturalSpeech 3 ☆210 Updated 9 months ago
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch ☆271 Updated last year
- Train the next generation of TTS systems. ☆165 Updated 8 months ago
- Fine-tunes the LLM part of the Spark-TTS model ☆79 Updated 2 months ago
- ☆140 Updated last year
- Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech ☆236 Updated last year
- An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice" ☆97 Updated 2 years ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆162 Updated last year
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆119 Updated 2 years ago
- ☆46 Updated 9 months ago
- Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours) ☆459 Updated this week
- It's a repository for implementations of neural speech editing algorithms. ☆198 Updated last year
- Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syl… ☆60 Updated 5 months ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆251 Updated 4 months ago
- ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis ☆138 Updated 8 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆145 Updated last year
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆157 Updated last week
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io ☆67 Updated last year
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ☆343 Updated 9 months ago
- The Open Source Code of UniAudio ☆563 Updated 10 months ago
- Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech" ☆192 Updated last year
- [INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for … ☆146 Updated 2 weeks ago