IS2AI / MultilingualASR
☆11 Updated 3 years ago
Alternatives and similar repositories for MultilingualASR:
Users who are interested in MultilingualASR compare it to the libraries listed below.
- ☆11 Updated last year
- ☆9 Updated 3 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 11 months ago
- C++ version of pyannote audio overlapped speech detection pipeline ☆10 Updated 11 months ago
- ☆11 Updated last year
- ☆14 Updated last week
- Kaldi-style neural network training in PyTorch for use in place of nnet3 in Kaldi. ☆26 Updated 5 months ago
- ☆13 Updated 3 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- ☆9 Updated 5 years ago
- ☆20 Updated 5 months ago
- ☆9 Updated 3 months ago
- ☆10 Updated 3 months ago
- Russian phonetic transcription ☆9 Updated last year
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆8 Updated 2 years ago
- Training BERT for the punctuation task ☆10 Updated 4 years ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated 10 months ago
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆11 Updated this week
- This repository contains all the code necessary for running the multilingual distilwhisper from the Ferraz et al. 2024 IEEE ICASSP paper. ☆20 Updated 10 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Updated last year
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement. ☆11 Updated 2 months ago
- Normalize Text in Russian ☆26 Updated last year
- ☆11 Updated last year
- WarpRNNT loss ported to Numba CPU/CUDA for PyTorch ☆16 Updated 2 years ago
- This is an extension of the Kaldi speech recognition software which allows decoding of speech with hybrid word and phoneme graphs.… ☆11 Updated 4 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level… ☆10 Updated last month
- T5-based (Russian) text normalization ☆20 Updated 11 months ago
- PyTorch implementation of silero-vad ☆12 Updated last month
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos. ☆28 Updated 2 months ago