guglielmocamporese / learning_invariances_in_speech_recognition
In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art systems use convolutional neural networks (CNNs) because they are naturally suited to learning correlated representations, such as those found in speech. In particular, I develop different CNNs trained on the Google Speech Command Dataset…
☆19 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for learning_invariances_in_speech_recognition
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based in the neural network architec… ☆43 · Updated last year
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP… ☆59 · Updated 4 years ago
- Source code of the DCASE 2020 SELD submission "Audio Event Detection and Localization with Multitask Regression Network" ☆16 · Updated 4 years ago
- Interspeech 2019 tutorial materials ☆48 · Updated 5 years ago
- Tensor2tensor experiment with SpecAugment ☆47 · Updated 5 years ago
- Python toolkit for speech processing ☆67 · Updated this week
- fast SpecAugmentation code with numpy and scipy ☆30 · Updated 5 years ago
- An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 trai… ☆54 · Updated 4 years ago
- This code implements a basic MLP for speech recognition. The MLP is trained with pytorch, while feature extraction, alignments, and dec… ☆37 · Updated 6 years ago
- Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset. ☆26 · Updated 5 years ago
- VoxSRC Challenge ☆31 · Updated 5 years ago
- Experiments on speech recognition robustness to accents and dialects ☆12 · Updated 5 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech ☆51 · Updated 3 years ago
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present… ☆24 · Updated 2 years ago
- Voxceleb1 i-vector based speaker recognition system ☆43 · Updated 6 years ago
- WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models wi… ☆89 · Updated 3 years ago
- Constrained Permutation Invariant Training, Speech Separation ☆43 · Updated 3 years ago
- An implementation of Power Normalized Cepstral Coefficients: PNCC ☆50 · Updated 5 years ago
- Baseline kaldi script for UA-SPEECH corpus ☆29 · Updated last month
- Lattice combination algorithm to combine inaccurate transcripts with hypothesis lattices ☆16 · Updated 8 months ago
- implementation of "EFFICIENT KEYWORD SPOTTING USING DILATED CONVOLUTIONS AND GATING" ☆35 · Updated 4 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆57 · Updated 4 years ago
- Keras-based python framework to compute phonological posterior probabilities from audio files ☆37 · Updated last year
- Code for AccentDB. ☆19 · Updated 3 years ago
- ABX and kaldi experiments on speech corpora made easy ☆31 · Updated last month