NVIDIA / NeMo-text-processing
NeMo text processing for ASR and TTS
☆284Updated this week
Related projects
Alternatives and complementary repositories for NeMo-text-processing
- ☆307Updated 2 months ago
- Multilingual G2P in 100 languages☆288Updated last year
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction☆232Updated 6 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆183Updated 2 months ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…☆477Updated 5 months ago
- Segment an audio file and obtain utterance alignments. (Python package)☆321Updated 6 months ago
- Automatically Update Text-to-speech (TTS) Papers Daily using GitHub Actions (Update Every 12 hours)☆287Updated this week
- CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus☆183Updated 2 years ago
- ONNX wrapper for ESPnet inference model☆156Updated last month
- Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch☆257Updated last year
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"☆310Updated 2 months ago
- It's a repository for implementations of neural speech editing algorithms.☆191Updated 10 months ago
- Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…☆187Updated 3 weeks ago
- Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023☆202Updated last year
- ☆254Updated last year
- Unofficial implementation of NVIDIA P-Flow TTS paper☆216Updated 4 months ago
- Large, modern dataset for speech recognition☆646Updated 8 months ago
- UniSpeech - Large Scale Self-Supervised Learning for Speech☆434Updated 7 months ago
- Audio Large Language Models☆136Updated this week
- Easy-to-Use Speech MOS predictors☆230Updated last year
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.☆201Updated 10 months ago
- FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener…☆369Updated 9 months ago
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆118Updated 2 weeks ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆157Updated 8 months ago
- Predicts the level of noise and reverberation in your audio files☆138Updated 5 months ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆140Updated 11 months ago
- A toolkit for processing speech data and creating speech datasets☆88Updated this week
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆371Updated 3 weeks ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆365Updated 2 months ago
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf☆363Updated 3 years ago