Phonemes and durations labeling based on whisper small
☆11Jul 7, 2024Updated last year
Alternatives and similar repositories for fast-phasr
Users that are interested in fast-phasr are comparing it to the libraries listed below
Sorting:
- unofficial pytorch implementation of HiFi-GAN with fast MISR.☆15Mar 21, 2023Updated 3 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆28Aug 4, 2023Updated 2 years ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023☆27Apr 27, 2023Updated 2 years ago
- ☆19Feb 2, 2023Updated 3 years ago
- A chinese singing voice dataset, professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- Mutiband version of HIFIGAN☆19Nov 6, 2020Updated 5 years ago
- ☆46Apr 16, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS☆24Jan 29, 2022Updated 4 years ago
- ☆47Aug 31, 2024Updated last year
- Audio tokenization, in the fastest way possible!☆53Aug 26, 2024Updated last year
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge☆21Jul 25, 2022Updated 3 years ago
- SOFA_AI: Singing-Oriented Forced Aligner for Automatic Inference☆25May 28, 2024Updated last year
- LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM☆18May 17, 2024Updated last year
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"☆34Nov 23, 2023Updated 2 years ago
- My vocoder experiments☆31Jul 26, 2025Updated 7 months ago
- An implementation of Charactr, Inc's "WavThruVec: Latent speech representation as intermediate features for neural speech synthesis"☆29Sep 6, 2023Updated 2 years ago
- ☆22Apr 4, 2023Updated 2 years ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆36Jan 17, 2024Updated 2 years ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆69Nov 1, 2024Updated last year
- Sequence alignement methods with helpers for PyTorch.☆24Nov 30, 2022Updated 3 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated last year
- ☆12Feb 3, 2026Updated last month
- Trying to build an all in one speech-text language model - a bit like GPT-4o☆22Jun 1, 2024Updated last year
- ☆15Nov 11, 2024Updated last year
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆40Jul 10, 2023Updated 2 years ago
- ☆71Jul 13, 2023Updated 2 years ago
- A neural speech codec based on discrete WavLM representations☆25Aug 28, 2024Updated last year
- GPT for FACodec☆13Mar 25, 2024Updated last year
- a guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine☆42Jan 17, 2025Updated last year
- A Japanese G2P tool based on pyopenjtalk☆25Aug 6, 2022Updated 3 years ago
- 基于vits fastspeech2 visinger的tts模型☆24Mar 9, 2023Updated 3 years ago
- Experiments for "Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision"☆14Aug 4, 2023Updated 2 years ago
- ☆19Mar 22, 2024Updated 2 years ago
- A repository comprising of code for generation of noisy speech data from clean data using deep learning methods☆16Jul 12, 2021Updated 4 years ago
- text to speech☆10Mar 19, 2024Updated 2 years ago