ishine / ContextNet
TensorFlow 2 implementation of ContextNet, an improved convolutional RNN-transducer-based architecture for end-to-end speech recognition using global context
☆17Updated 4 years ago
Alternatives and similar repositories for ContextNet:
Users who are interested in ContextNet are comparing it to the libraries listed below
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT…☆38Updated 3 years ago
- [ICASSP2021] Data preparation scripts, training pipeline and baseline experiment results for the Interspeech 2020 Accented English Speech…☆55Updated 4 years ago
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- py-webrtcvad wrapper for trimming speech clips☆48Updated 2 years ago
- ☆29Updated 4 years ago
- Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"☆16Updated 3 years ago
- An implementation of Power Normalized Cepstral Coefficients: PNCC☆52Updated 5 years ago
- A Python library for computing the Mel-Cepstral Distance (Mel-Cepstral Distortion, MCD) between two inputs. This implementation is based …☆51Updated last week
- End-To-End Speaker Verification based on X-vector and Neural PLDA - A PyTorch implementation☆22Updated 3 years ago
- ☆60Updated 4 years ago
- ☆33Updated 4 years ago
- PyTorch implementation of RPNSD☆60Updated 10 months ago
- This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-cl…☆76Updated 2 years ago
- Constrained Permutation Invariant Training, Speech Separation☆47Updated 4 years ago
- STOI loss function in PyTorch☆91Updated 6 months ago
- Audio-Visual Voice Activity Detection using Deep Learning☆48Updated 6 years ago
- streaming attention networks for end-to-end automatic speech recognition☆55Updated 4 years ago
- ☆59Updated 4 years ago
- Pronunciation-assisted Subword Modeling☆29Updated 5 years ago
- The code for the Interspeech paper "Speaker Embedding Extraction with Phonetic Information"☆45Updated 5 years ago
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal.☆54Updated 2 years ago
- Code for reproducing experiments in "Domain-Adversarial Voice Activity Detection"☆23Updated 5 years ago
- Code for synchronising all CHiME-5 audio signals for use in CHiME-6☆18Updated 5 years ago
- A simple package for Guided source separation (GSS)☆121Updated 11 months ago
- ☆29Updated 3 years ago
- ☆50Updated 4 years ago
- Text frontend for ESPnet tts recipes☆31Updated 3 years ago
- Transformer-based online speech recognition system with TensorFlow 2☆26Updated 4 years ago
- Dynamic Chunk Streaming and Offline Conformer based on athena-team/Athena.☆44Updated 2 years ago
- The repo contains our code for "Semantic Mask for Transformer based End-to-End Speech Recognition"☆38Updated 4 years ago