fabianoluzbr / neural-g2p-portuguese
Grapheme-to-phoneme (G2P) conversion is the process of generating the pronunciation of a word from its written form. It plays an essential role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This project was adapted from https://github.com/hajix/G2P.
☆17 · Updated 3 years ago
Related projects:
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆10 · Updated 9 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- Prosodic Speech Segmentation with Transformers ☆22 · Updated 6 months ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition" ☆17 · Updated 3 years ago
- This repository contains all the code necessary for running the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper. ☆16 · Updated 6 months ago
- Code for paper titled "Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0" submitt… ☆16 · Updated 4 years ago
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr… ☆13 · Updated last year
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering ☆18 · Updated 11 months ago
- A list of papers for child ASR ☆24 · Updated 5 months ago
- Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model ☆26 · Updated last year
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆13 · Updated last year
- Simple tool for speech dataset augmentation for modeling various prosodies. ☆14 · Updated 3 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Updated 2 years ago
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆17 · Updated last year
- Please visit: https://thuhcsi.github.io/icassp2021-emotion-tts/ ☆32 · Updated last year
- PyTorch implementation of "f0-consistent many-to-many non-parallel voice conversion via conditional autoencoder" ☆28 · Updated 3 years ago
- A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS ☆19 · Updated last month
- Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations. ☆30 · Updated 2 months ago
- Deep Speech Distances PyTorch ☆27 · Updated 2 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆29 · Updated 2 years ago
- Python wrappers for Kaldi's Levenshtein distance and alignment code. ☆60 · Updated 6 months ago
- Wake-up word emotion recognition [APSIPA 2022] ☆17 · Updated last year
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆29 · Updated last month
- FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024) ☆21 · Updated 6 months ago