zakuro-ai / asr
ASRDeepspeech x Sakura-ML (English/Japanese) with deepspeech2 model in pytorch with support from Zakuro AI.
☆68Updated 2 years ago
Alternatives and similar repositories for asr
Users that are interested in asr are comparing it to the libraries listed below
- ☆226Updated last year
- context labels and pronunciation data for JSUT corpus☆74Updated 4 years ago
- ESPnet Model Zoo☆256Updated 2 years ago
- ONNX wrapper for ESPnet inference model☆169Updated last month
- ☆89Updated 4 years ago
- Repository for the paper: VoiceMe: Personalized voice generation in TTS☆125Updated 3 years ago
- Real-time Japanese speech recognition translator using wav2vec2☆39Updated 3 years ago
- Python wrapper for OpenJTalk☆234Updated 6 months ago
- One-button-press forced aligner for Japanese, using Julius.☆47Updated 2 years ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆257Updated last year
- Deep neural network (DNN) for noise reduction, removal of background music, and speech separation☆172Updated 2 years ago
- ☆67Updated 3 months ago
- ☆32Updated 2 years ago
- A fork of open_jtalk☆64Updated 6 months ago
- PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech☆229Updated 3 years ago
- ☆28Updated 4 years ago
- Voice Activity Detection (VAD) using deep learning.☆200Updated 5 years ago
- Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow☆129Updated 4 years ago
- ☆22Updated 4 years ago
- VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network☆321Updated last year
- Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)☆217Updated 2 years ago
- A public domain single speaker Japanese speech dataset☆61Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io☆69Updated 2 years ago
- xvector model on jtubespeech☆45Updated last year
- iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform☆263Updated 2 months ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…☆249Updated 3 years ago
- Self-made labels for the JVS (Japanese versatile speech) corpus☆31Updated 4 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆341Updated last year
- ☆200Updated 3 years ago
- Official implementation of the source-filter HiFiGAN vocoder☆260Updated 2 years ago