NVIDIA / NeMo-speech-data-processor
A toolkit for processing speech data and creating speech datasets
☆110 Updated this week
Alternatives and similar repositories for NeMo-speech-data-processor
Users who are interested in NeMo-speech-data-processor are comparing it to the repositories listed below.
- ☆100 Updated 2 weeks ago
- NeMo text processing for ASR and TTS ☆327 Updated 3 weeks ago
- A TTS model that makes a speaker speak new languages ☆76 Updated 10 months ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva ☆88 Updated 2 months ago
- VoiceBox neural network implementation ☆107 Updated 9 months ago
- LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆105 Updated last month
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆159 Updated 7 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 7 months ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆51 Updated 10 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆194 Updated 8 months ago
- ONNX and TensorRT implementation of Whisper ☆63 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆83 Updated last year
- An unofficial PyTorch implementation of VALL-E ☆87 Updated last week
- ☆359 Updated 8 months ago
- Extensions to YAML syntax for better python interaction ☆66 Updated last year
- Audio tokenization, in the fastest way possible! ☆52 Updated 8 months ago
- ☆84 Updated last year
- Audio Codec Speech processing Universal PERformance Benchmark ☆253 Updated last month
- ☆143 Updated 7 months ago
- Official implementation of the TTS model Lina-Speech ☆165 Updated 4 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆76 Updated last year
- The VoxTube dataset official repository ☆68 Updated last year
- ☆59 Updated last year
- Unofficial implementation of NVIDIA P-Flow TTS paper ☆222 Updated 4 months ago
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023) ☆27 Updated last year
- Collection of Open Source Speech Data ☆157 Updated 6 months ago
- Implementation of Google's USM speech model in Pytorch ☆31 Updated last month
- ConMamba for Automatic Speech Recognition ☆72 Updated 9 months ago
- Tunable pipelines ☆33 Updated 2 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆144 Updated last year