assafmu / wav2letter_pytorch
An implementation of the Wav2Letter Speech-to-Text model using PyTorch.
☆14 Updated last year
Alternatives and similar repositories for wav2letter_pytorch:
Users who are interested in wav2letter_pytorch are comparing it to the libraries listed below.
- ☆32 Updated 3 years ago
- The repository for the Speech Recognition Israel meetup group. It is used for collecting and sharing material. ☆13 Updated 4 years ago
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- [DEPRECATED] Audio Module for fastai v2 ☆65 Updated last year
- Code for the Paper Speech Recognition and Multi-Speaker Diarization of Long Conversations ☆36 Updated last year
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone … ☆41 Updated 2 years ago
- ☆12 Updated 3 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 Updated 2 years ago
- Interspeech 2019 tutorial materials ☆48 Updated 5 years ago
- SiSEC MUS 2018 Submission System ☆43 Updated 5 years ago
- LogMMSE speech enhancement/noise reduction ☆30 Updated 4 years ago
- ☆20 Updated 5 years ago
- ☆17 Updated last year
- ☆56 Updated 2 years ago
- A Hackable speech recognition library. ☆25 Updated 4 months ago
- Anonymous ICLR Submission ☆14 Updated 5 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 Updated 4 years ago
- The Additive Margin MobileNet1D is a new lightweight deep learning model for Speaker Recognition which is based on the MobileNetV2 archi… ☆29 Updated last year
- End-to-end diarization loss ☆22 Updated 3 years ago
- ☆16 Updated 5 years ago
- Urban Sound Classification: striving towards a fair comparison ☆17 Updated 4 years ago
- Applying reinforcement learning to perform source separation. ☆21 Updated 4 years ago
- NIST SPH File reader (e.g. for TEDLIUM Corpus) ☆25 Updated 4 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 Updated 2 years ago
- This repository provides data and code for "Vox Populi, Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription" paper. ☆15 Updated 3 years ago
- ☆11 Updated 3 years ago
- Fast and differentiable hidden Markov model in C++ ☆17 Updated 2 years ago
- An audio classification system for learning with out-of-distribution data ☆33 Updated 2 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ☆10 Updated 4 years ago
- Keras implementation of musicnn, a set of pre-trained deep convolutional neural networks for music audio tagging ☆25 Updated 3 years ago