oleges1 / quartznet-pytorchLinks
Quartznet implementation on pytorch [https://arxiv.org/abs/1910.10261]
☆27Updated 3 years ago
Alternatives and similar repositories for quartznet-pytorch
Users that are interested in quartznet-pytorch are comparing it to the libraries listed below
Sorting:
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTroch with CTC loss and beam search.☆16Updated 4 years ago
- The VoxTube dataset official repository☆68Updated last year
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.☆88Updated 2 years ago
- A PyTorch implementation of the universal neural vocoder☆67Updated 4 years ago
- Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.☆69Updated 4 years ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis☆106Updated 3 years ago
- Convert kaldi feature extraction and nnet3 models into Tensorflow Lite models. Currently aimed at converting kaldi's x-vector models and …☆20Updated 2 years ago
- ☆25Updated 9 months ago
- Tensorflow2 based implementation of ContextNet, an improved convolutional rnn-transducer-based architecture for end-to-end speech recogni…☆17Updated 4 years ago
- Pytorch implementation of "Efficienttts: an efficient and high-quality text-to-speech architecture"☆116Updated 3 years ago
- An implement of GlowTTS model. Several modes are added: speaker embedding, prosody encoder(GST), and gradient reversal.☆53Updated 2 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 4 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆65Updated 4 years ago
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆60Updated 4 years ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder☆120Updated 2 years ago
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021☆39Updated 3 years ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆41Updated 3 years ago
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- This is the source code of the paper "Neural grapheme-to-phoneme conversion with pretrained grapheme models☆46Updated 3 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆73Updated 4 years ago
- This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…☆74Updated 2 years ago
- Clustering-based methods for overlapping diarization☆81Updated last year
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…☆25Updated 2 years ago
- ☆56Updated 11 months ago
- Implementation of audio degradation processes☆102Updated 9 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆62Updated 5 years ago
- A pytroch implementation of the FB-MelGAN☆89Updated 5 years ago
- ☆36Updated last month
- The codebase for Data-driven general-purpose voice activity detection.☆94Updated last year
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT…☆38Updated 3 years ago