guglielmocamporese / learning_invariances_in_speech_recognition
In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art approaches use convolutional neural networks (CNNs) because they are naturally suited to learning correlated representations, such as those found in speech. In particular, I develop different CNNs trained on the Google Speech Command Dataset…
☆19Updated 6 years ago
Alternatives and similar repositories for learning_invariances_in_speech_recognition:
Users who are interested in learning_invariances_in_speech_recognition are comparing it to the libraries listed below
- fast SpecAugmentation code with numpy and scipy☆30Updated 5 years ago
- Python library for audio augmentation☆83Updated last year
- This code implements a basic MLP for speech recognition. The MLP is trained with PyTorch, while feature extraction, alignments, and dec…☆38Updated 7 years ago
- Audio activity detector based on per-channel energy normalization (PCEN)☆29Updated 6 years ago
- ☆24Updated 6 years ago
- Benchmark for sound event localization task of DCASE 2019 challenge☆76Updated 4 years ago
- Tensor2tensor experiment with SpecAugment☆46Updated 5 years ago
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based on the neural network architec…☆43Updated last year
- Audio data augmentation examples☆34Updated 6 years ago
- The Additive Margin MobileNet1D is a new lightweight deep learning model for Speaker Recognition which is based on the MobileNetV2 archi…☆30Updated last year
- Feature extractor for DL speech processing.☆65Updated 2 years ago
- Python toolkit for speech processing☆68Updated this week
- Visualization toolbox for Sound Event Detection☆119Updated last year
- PyTorch-based phoneme recognition (TIMIT phoneme classification)☆34Updated 6 years ago
- ☆60Updated 4 years ago
- Python implementation of pre-processing for End-to-End speech recognition☆69Updated 7 years ago
- Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset.☆25Updated 6 years ago
- PyTorch implementation of a self-attentive speaker embedding☆17Updated 5 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks☆64Updated 4 years ago
- Multiple Instance Learning for Sound Event Detection☆34Updated 6 years ago
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 4 years ago
- This Python code performs efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of a…☆95Updated 4 years ago
- Source code of the DCASE 2020 SELD submission "Audio Event Detection and Localization with Multitask Regression Network"☆16Updated 4 years ago
- ☆9Updated 4 years ago
- This repository contains the code and supplementary result for the paper "Unpaired Speech Enhancement by Acoustic and Adversarial Supervi…☆28Updated 5 years ago
- This repository provides information on how to use the SINS database along with some example code. The SINS Dataset is composed of conti…☆23Updated 2 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆61Updated 4 years ago
- 📊 Easily apply audio-related machine learning models trained on the AudioSet dataset (527+ models/classes).☆29Updated 9 months ago
- A PyTorch implementation of Tacotron2, an end-to-end text-to-speech (TTS) system described in "Natural TTS Synthesis By Conditioning Waven…☆52Updated 6 years ago
- Chainer implementation of between-class learning for sound recognition https://arxiv.org/abs/1711.10282☆93Updated 7 years ago