aminul-huq / Speech-Command-Classification
Speech command classification on the Speech Commands v0.02 dataset using PyTorch and torchaudio. In this example, three models are trained on the raw signal waveforms, MFCC features, and mel spectrogram features.
☆9 · Updated 2 years ago
Alternatives and similar repositories for Speech-Command-Classification:
Users interested in Speech-Command-Classification are comparing it to the repositories listed below.
- Matlab tools for pathological voice analysis ☆13 · Updated last year
- ☆36 · Updated 2 years ago
- Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition ☆72 · Updated 2 years ago
- Repository for my paper: Deep Multilayer Perceptrons for Dimensional Speech Emotion Recognition ☆11 · Updated last year
- Automatic speech emotion recognition based on transfer learning from spectrograms using ResNET ☆21 · Updated 3 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings' ☆129 · Updated 2 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Updated 2 years ago
- ☆49 · Updated 2 years ago
- 3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition. ☆38 · Updated 4 years ago
- An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection ☆70 · Updated 3 years ago
- ☆41 · Updated 4 years ago
- Human emotions are one of the strongest ways of communication. Even if a person doesn’t understand a language, he or she can very well u… ☆24 · Updated 3 years ago
- Speaker verification using ResnetSE (EER=0.0093) and ECAPA-TDNN ☆91 · Updated 3 years ago
- Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure python and PyTorch ☆24 · Updated 8 months ago
- Paderborn Sound Event Detection ☆73 · Updated last year
- Submission to the HEAR2021 Challenge ☆16 · Updated 3 years ago
- ☆17 · Updated 3 years ago
- End-To-End Speaker Verification based on X-vector and Neural PLDA - A PyTorch implementation ☆22 · Updated 3 years ago
- Experimental comparison of three end-to-end speaker embedding frameworks (Deep Speaker, RawNet, GE2E) on three standard public datasets: VCTK, AISHELL1, and VoxCeleb1. ☆22 · Updated 4 years ago
- PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification ☆21 · Updated last year
- ☆32 · Updated 2 years ago
- Room impulse response simulator using python ☆96 · Updated 4 years ago
- wsj0-{2, 3, 4, 5} mix generation scripts, in Python. ☆58 · Updated 4 years ago
- Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP. ☆15 · Updated 3 years ago
- ☆18 · Updated 2 years ago
- Baseline method for sound event localization task of DCASE 2023 challenge ☆49 · Updated 2 years ago
- ☆79 · Updated 7 months ago
- ☆53 · Updated 4 years ago
- PyTorch implementation of the LEAF audio frontend ☆69 · Updated 2 years ago
- Multilingual datasets with raw audio for speech emotion recognition ☆23 · Updated 3 years ago