aminul-huq / Speech-Command-Classification
Speech command classification on the Speech Commands v0.02 dataset using PyTorch and torchaudio. In this example, three models are trained on the raw signal waveforms, MFCC features, and mel spectrogram features.
☆9 · Updated 2 years ago
Alternatives and similar repositories for Speech-Command-Classification:
Users interested in Speech-Command-Classification are comparing it to the repositories listed below.
- Matlab tools for pathological voice analysis ☆13 · Updated last year
- ☆36 · Updated 2 years ago
- Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition ☆72 · Updated 2 years ago
- Repository for my paper: Deep Multilayer Perceptrons for Dimensional Speech Emotion Recognition ☆11 · Updated last year
- Automatic speech emotion recognition based on transfer learning from spectrograms using ResNET ☆21 · Updated 3 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings' ☆129 · Updated 2 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Updated 2 years ago
- ☆49 · Updated 2 years ago
- 3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition. ☆38 · Updated 4 years ago
- An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection ☆70 · Updated 3 years ago
- ☆41 · Updated 4 years ago
- Human emotions are one of the strongest ways of communication. Even if a person doesn’t understand a language, he or she can very well u… ☆24 · Updated 3 years ago
- Speaker verification using ResnetSE (EER=0.0093) and ECAPA-TDNN ☆91 · Updated 3 years ago
- Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure python and PyTorch ☆24 · Updated 8 months ago
- Paderborn Sound Event Detection ☆73 · Updated last year
- Submission to the HEAR2021 Challenge ☆16 · Updated 3 years ago
- ☆17 · Updated 3 years ago
- End-To-End Speaker Verification based on X-vector and Neural PLDA - A PyTorch implementation ☆22 · Updated 3 years ago
- Experimental comparison of three end-to-end speaker embedding frameworks (Deep Speaker, RawNet, GE2E) on three standard public datasets: VCTK, AISHELL1, and VoxCeleb1. ☆22 · Updated 4 years ago
- PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification ☆21 · Updated last year
- ☆32 · Updated 2 years ago
- Room impulse response simulator using python ☆96 · Updated 4 years ago
- wsj0-{2, 3, 4, 5} mix generation scripts, in Python. ☆58 · Updated 4 years ago
- Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP. ☆15 · Updated 3 years ago
- ☆18 · Updated 2 years ago
- Baseline method for sound event localization task of DCASE 2023 challenge ☆49 · Updated 2 years ago
- ☆79 · Updated 7 months ago
- ☆53 · Updated 4 years ago
- PyTorch implementation of the LEAF audio frontend ☆69 · Updated 2 years ago
- Multilingual datasets with raw audio for speech emotion recognition ☆23 · Updated 3 years ago