assafmu / wav2letter_pytorch
An implementation of the Wav2Letter Speech-to-Text model using PyTorch.
☆14 · Updated 2 years ago
Alternatives and similar repositories for wav2letter_pytorch:
Users who are interested in wav2letter_pytorch are comparing it to the libraries listed below:
- ☆12 · Updated 3 years ago
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge. ☆14 · Updated 3 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 · Updated 2 years ago
- A collection of utilities for handling IPA phones. ☆25 · Updated last year
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper. ☆15 · Updated 3 years ago
- A 🔥 cookiecutter template for building Hugging Face Spaces ☆11 · Updated 3 years ago
- ☆11 · Updated 3 years ago
- A simple implementation of the paper https://arxiv.org/pdf/1910.00716v1.pdf ☆31 · Updated 3 years ago
- ☆17 · Updated last year
- Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation ☆14 · Updated 4 years ago
- Interspeech 2019 tutorial materials ☆48 · Updated 5 years ago
- ☆17 · Updated 3 years ago
- Voice conversion training with 109 speakers with limited training samples ☆35 · Updated 4 years ago
- ☆32 · Updated 3 years ago
- A PyTorch implementation of the paper: "AMSS-Net: Audio Manipulation on User-Specified Sources with Textual Queries" (ACM Multimedia 2021… ☆21 · Updated 3 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 · Updated 4 years ago
- [DEPRECATED] Audio Module for fastai v2 ☆65 · Updated last year
- ☆29 · Updated 4 years ago
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone … ☆41 · Updated 2 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ☆11 · Updated 4 years ago
- WaveNet implementation using tf.estimator ☆21 · Updated last year
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations ☆37 · Updated last year
- ☆31 · Updated 2 years ago
- Urban Sound Classification : striving towards a fair comparison ☆17 · Updated 4 years ago
- Audio activity detector based on per-channel energy normalization (PCEN) ☆29 · Updated 6 years ago
- ☆26 · Updated 3 years ago
- SiSEC MUS 2018 Submission System ☆43 · Updated 5 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated last year
- follow NVIDIA, simplify it and support data parallel. ☆13 · Updated 5 years ago
- Web page for ISCA Special Interest Group: Robust Speech Processing (RoSP) ☆11 · Updated last year