oleges1 / quartznet-pytorch
QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
☆26Updated 3 years ago
Related projects:
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. Implemented in PyTorch with CTC loss and beam search.☆15Updated 3 years ago
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021☆39Updated 3 years ago
- Phonetically-Oriented Word Error Rate☆31Updated 5 years ago
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…☆45Updated last year
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations☆36Updated last year
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.☆52Updated 2 years ago
- The VoxTube dataset official repository☆60Updated 7 months ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆73Updated 3 years ago
- Linear prediction coefficient estimation from a mel-spectrogram, implemented in Python using the Levinson-Durbin algorithm.☆67Updated 3 years ago
- Example code for a neural transducer model.☆58Updated 7 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆39Updated 2 months ago
- Implementation of AlignTTS☆76Updated last year
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis☆101Updated 3 years ago
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…☆26Updated last year
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)☆78Updated 2 years ago
- Clustering-based methods for overlapping diarization☆68Updated 8 months ago
- A PyTorch implementation of the universal neural vocoder☆66Updated 3 years ago
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 3 years ago
- multilingual speech aligner☆70Updated 10 months ago
- This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…☆74Updated last year
- PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture"☆115Updated 2 years ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder☆115Updated 2 years ago
- Python wrappers for Kaldi's Levenshtein distance and alignment code.☆60Updated 6 months ago
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT…☆33Updated 2 years ago
- Transcripts and segmentation for the Blizzard 2013 audiobooks also known as the Lessac or Blizzard 2013 dataset.☆43Updated 4 years ago
- streaming attention networks for end-to-end automatic speech recognition☆55Updated 4 years ago
- MOS score prediction by fine-tuned wav2vec2.0 model☆135Updated last year