oleges1 / quartznet-pytorch
Quartznet implementation on pytorch [https://arxiv.org/abs/1910.10261]
☆26Updated 3 years ago
Alternatives and similar repositories for quartznet-pytorch:
Users that are interested in quartznet-pytorch are comparing it to the libraries listed below
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTroch with CTC loss and beam search.☆16Updated 4 years ago
- A PyTorch implementation of the universal neural vocoder☆67Updated 4 years ago
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021☆39Updated 3 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 6 months ago
- ☆29Updated 3 years ago
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT…☆34Updated 2 years ago
- Pytorch implementation of "Efficienttts: an efficient and high-quality text-to-speech architecture"☆115Updated 3 years ago
- Sequence-to-sequence TTS based on Kyubyong's dc_tts☆60Updated last year
- This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…☆74Updated 2 years ago
- Pytorch implementation of Generalized End-to-End Loss for speaker verification☆83Updated 5 years ago
- ☆25Updated 5 months ago
- The VoxTube dataset official repository☆64Updated 11 months ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Updated 4 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆72Updated 4 years ago
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.☆88Updated 2 years ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder☆115Updated 2 years ago
- Clustering-based methods for overlapping diarization☆74Updated last year
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)☆79Updated 3 years ago
- Transcripts and segmentation for the Blizzard 2013 audiobooks also known as the Lessac or Blizzard 2013 dataset.☆44Updated 5 years ago
- Pytorch version of Voice Activity Detection (VAD) based on Deep Learning (https://github.com/filippogiruzzi)☆26Updated 3 years ago
- ☆56Updated 2 years ago
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 4 years ago
- Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.☆68Updated 3 years ago
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…☆27Updated last year
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- Training code and trained checkpoints for ASGAN.☆62Updated last year
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆35Updated 3 years ago
- Implementation of the AlignTTS☆76Updated last year
- An implement of GlowTTS model. Several modes are added: speaker embedding, prosody encoder(GST), and gradient reversal.☆53Updated 2 years ago