guglielmocamporese / learning_invariances_in_speech_recognition
In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art systems use convolutional neural networks (CNNs) because they naturally learn correlated representations, and speech is highly correlated. In particular, I develop different CNNs trained on the Google Speech Command Dataset…
☆19Updated 6 years ago
Alternatives and similar repositories for learning_invariances_in_speech_recognition:
Users who are interested in learning_invariances_in_speech_recognition are comparing it to the libraries listed below.
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based on the neural network architec…☆43Updated last year
- Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP…☆59Updated 4 years ago
- Feature extractor for DL speech processing.☆65Updated 2 years ago
- Fast SpecAugmentation code with NumPy and SciPy☆30Updated 5 years ago
- Baseline Kaldi script for UA-SPEECH corpus☆30Updated 5 months ago
- This Python code performs efficient speech reverberation starting from a dataset of close-talking speech signals and a collection of a…☆95Updated 4 years ago
- Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset.☆25Updated 6 years ago
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present…☆25Updated 2 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 4 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks".☆13Updated 2 years ago
- A Kaldi recipe for training automatic speech recognition systems on the Torgo corpus of dysarthric speech☆17Updated last year
- ☆29Updated 4 years ago
- Python toolkit for speech processing☆68Updated this week
- Keras-based Python framework to compute phonological posterior probabilities from audio files☆41Updated 2 years ago
- ☆13Updated 6 years ago
- Interspeech 2019 tutorial materials☆48Updated 5 years ago
- WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models wi…☆89Updated 3 years ago
- Code for AccentDB.☆20Updated 3 years ago
- A Kaldi/ESPnet based approach to perform automatic speech recognition on low resource languages☆9Updated 4 years ago
- 📊 Easily apply audio-related machine learning models trained on the AudioSet dataset (527+ models/classes).☆29Updated 9 months ago
- VoxSRC Challenge☆31Updated 5 years ago
- This repository describes our reproducible framework for assessing self-supervised representation learning from speech☆51Updated 3 years ago
- Implementation for paper "iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric L…☆54Updated last year
- ☆47Updated 4 years ago
- An implementation of Power Normalized Cepstral Coefficients: PNCC☆52Updated 5 years ago
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"☆17Updated 4 years ago
- Tensor2tensor experiment with SpecAugment☆46Updated 5 years ago
- Implementation of "Efficient Keyword Spotting Using Dilated Convolutions and Gating"☆36Updated 5 years ago
- PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training☆39Updated 4 years ago
- GPU accelerated implementation of i-vector extractor training using PyTorch. Requires Kaldi for feature extraction and UBM training. An e…☆64Updated 5 years ago