guglielmocamporese / learning_invariances_in_speech_recognition
In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art approaches use convolutional neural networks (CNNs) because they are well suited to learning correlated representations, such as those found in speech. In particular, I develop different CNNs trained on the Google Speech Command Dataset…
☆19 Updated 6 years ago
Alternatives and similar repositories for learning_invariances_in_speech_recognition:
Users who are interested in learning_invariances_in_speech_recognition are comparing it to the repositories listed below.
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP… ☆59 Updated 4 years ago
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based on the neural network architec… ☆43 Updated last year
- ☆24 Updated 6 years ago
- Audio activity detector based on per-channel energy normalization (PCEN) ☆29 Updated 6 years ago
- The Additive Margin MobileNet1D is a new lightweight deep learning model for speaker recognition which is based on the MobileNetV2 archi… ☆30 Updated last year
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training ☆39 Updated 4 years ago
- Tensor2Tensor experiment with SpecAugment ☆46 Updated 5 years ago
- Rescoring methods for end-to-end automatic speech recognition ☆27 Updated 4 years ago
- A Kaldi/ESPnet-based approach to perform automatic speech recognition on low-resource languages ☆9 Updated 4 years ago
- ☆35 Updated 3 weeks ago
- An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 trai… ☆54 Updated 5 years ago
- Benchmark for the sound event localization task of the DCASE 2019 challenge ☆76 Updated 4 years ago
- Spectra extraction tutorials based on torch and torchaudio. ☆41 Updated last year
- Fast SpecAugmentation code with NumPy and SciPy ☆30 Updated 5 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆61 Updated 4 years ago
- PyTorch reimplementation of per-channel energy normalization for audio. ☆98 Updated 6 years ago
- ☆12 Updated 3 years ago
- This code implements a basic MLP for speech recognition. The MLP is trained with PyTorch, while feature extraction, alignments, and dec… ☆38 Updated 7 years ago
- Filtering and Noise Adding Tool ☆29 Updated 2 years ago
- Code for the paper "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present… ☆25 Updated 2 years ago
- Code for the paper "Speech Recognition and Multi-Speaker Diarization of Long Conversations" ☆37 Updated last year
- Development Toolkit for the VoxCeleb Speaker Recognition Challenge 2020 ☆42 Updated 4 years ago
- Interspeech 2019 tutorial materials ☆48 Updated 5 years ago
- This Python code performs efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of a… ☆95 Updated 4 years ago
- An implementation of RNN-Transducer loss in TF-2.0. ☆45 Updated 2 years ago
- GPU-accelerated implementation of i-vector extractor training using PyTorch. Requires Kaldi for feature extraction and UBM training. An e… ☆64 Updated 5 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech ☆51 Updated 3 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 Updated 2 years ago
- PyTorch implementation of a self-attentive speaker embedding ☆17 Updated 5 years ago
- An online speech recognition extension toolkit for Kaldi ☆56 Updated 3 years ago