oleges1 / quartznet-pytorch
QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
☆27Updated 3 years ago
Alternatives and similar repositories for quartznet-pytorch:
Users who are interested in quartznet-pytorch are comparing it to the libraries listed below
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search.☆16Updated 4 years ago
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)☆81Updated 3 years ago
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training☆39Updated 4 years ago
- PyTorch implementation of Generalized End-to-End Loss for speaker verification☆84Updated 5 years ago
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 4 years ago
- A PyTorch implementation of the universal neural vocoder☆67Updated 4 years ago
- A speaker embedding network in PyTorch that is very quick to set up and use for any purpose.☆87Updated last year
- Phonetically-Oriented Word Error Rate☆34Updated 5 years ago
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020☆42Updated 4 years ago
- The VoxTube dataset official repository☆68Updated last year
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆40Updated 3 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆75Updated 3 years ago
- A collection of voice conversion research☆92Updated 2 weeks ago
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT…☆37Updated 3 years ago
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021☆39Updated 3 years ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis☆103Updated 3 years ago
- TensorFlow 2-based implementation of ContextNet, an improved convolutional rnn-transducer-based architecture for end-to-end speech recogni…☆17Updated 4 years ago
- Linear prediction coefficient estimation from a mel-spectrogram, implemented in Python and based on the Levinson-Durbin algorithm.☆68Updated 4 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆48Updated 8 months ago
- Multilingual speech aligner☆72Updated last year
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…☆25Updated 2 years ago
- A lightweight neural speaker embedding extractor based on Kaldi and PyTorch.☆136Updated 5 years ago
- This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…☆74Updated 2 years ago
- Implementation of AlignTTS☆76Updated last year
- An unofficial implementation of https://arxiv.org/abs/2005.05106☆46Updated 4 years ago
- Clustering-based methods for overlapping diarization☆78Updated last year
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.☆53Updated 2 years ago