yuboona / some-script-to-help-using-Montreal-Forced-Aligner
Some scripts to help with using Montreal Forced Aligner, mainly for transforming Hanzi characters to pinyin and extracting pause times from .textgrid files.
☆14Updated last year
Alternatives and similar repositories for some-script-to-help-using-Montreal-Forced-Aligner
Users that are interested in some-script-to-help-using-Montreal-Forced-Aligner are comparing it to the libraries listed below
- ICASSP2022 TTS&VC Summary☆14Updated 3 years ago
- ☆11Updated 3 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆27Updated 2 years ago
- Adaptive Vocoder for Custom Voice☆60Updated 3 years ago
- Please visit https://thuhcsi.github.io/SnakeGAN/ ☆37Updated 2 years ago
- Unofficial Pytorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…☆57Updated 2 years ago
- ICASSP 2021 accepted papers on voice conversion (VC)☆18Updated 4 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Updated last year
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in Pytorch.☆28Updated 3 years ago
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge☆21Updated 3 years ago
- An open-source Chinese singing voice synthesis dataset.☆23Updated 6 years ago
- Simulation of parallel synthesis with LPCNet vocoder☆14Updated 5 years ago
- TTS Text Analyzer☆32Updated 2 years ago
- Official PyTorch implementation of "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis…☆15Updated 7 months ago
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms☆18Updated 2 years ago
- This repository provides UNOFFICIAL Bunched LPCNet implementations with Pytorch.☆14Updated 4 years ago
- ☆23Updated 3 months ago
- ☆33Updated last month
- faster inference☆28Updated 8 months ago
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Updated 2 years ago
- ☆14Updated 3 years ago
- ☆25Updated 3 years ago
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report☆48Updated last month
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆50Updated last year
- ☆22Updated 2 years ago
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)☆26Updated last year
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…☆33Updated last year
- Synthesized singing voice demos of WeSinger 2 paper.☆27Updated 2 years ago
- RepVgg + HiFiGAN☆34Updated 3 years ago