brihijoshi / vanilla-stft-mfcc
A Python implementation of STFT and MFCC audio features from scratch
☆16 Updated 4 years ago
Alternatives and similar repositories for vanilla-stft-mfcc:
Users who are interested in vanilla-stft-mfcc are comparing it to the libraries listed below.
- Libri-CSS: dataset and evaluation pipeline ☆141 Updated 2 years ago
- This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1 ☆109 Updated 5 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings' ☆127 Updated 3 weeks ago
- This repository contains PyTorch implementation of 4 different models for classification of emotions of the speech. ☆197 Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization ☆261 Updated last year
- Implementation of the paper "Spoken Language Recognition using X-vectors" in Pytorch ☆106 Updated 4 years ago
- Multilingual datasets with raw audio for speech emotion recognition ☆22 Updated 3 years ago
- Deep-Learning-Based Audio-Visual Speech Enhancement and Separation ☆205 Updated last year
- An awesome spoken LID repository. (Work in progress) ☆98 Updated 9 months ago
- Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053 ☆144 Updated 2 years ago
- A collection of datasets for the purpose of emotion recognition/detection in speech. ☆308 Updated 4 months ago
- Implementation of Neural PLDA (NPLDA) model (A discriminative backend for Speaker Verification) ☆97 Updated 4 years ago
- Speaker embedding (d-vector) trained with GE2E loss ☆274 Updated last year
- Speech Separation ☆60 Updated 10 months ago
- Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM) ☆206 Updated last year
- Dataset and baseline code for the VocalSound dataset (ICASSP2022). ☆127 Updated 2 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data ☆96 Updated last year
- Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised … ☆131 Updated last year
- sha256 C++ concurrency GMM voiceprint recognition ☆18 Updated 6 years ago
- Voice Activity Detection (VAD) using deep learning. ☆193 Updated 5 years ago
- Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations" in Pytorch. ☆39 Updated last year
- A statistical model-based Voice Activity Detection ☆190 Updated 6 years ago
- The codebase for Data-driven general-purpose voice activity detection. ☆93 Updated last year
- Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper ☆142 Updated last year
- Automatic speech emotion recognition based on transfer learning from spectrograms using ResNET ☆21 Updated 2 years ago
- Official repository of our paper: https://arxiv.org/abs/2010.15366 ☆61 Updated 3 years ago
- An open source dataset for source separation ☆395 Updated 11 months ago
- Spot the conversation: speaker diarisation in the wild ☆132 Updated 2 years ago
- This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-cl… ☆75 Updated 2 years ago
- Data preparation for separation ☆76 Updated 3 years ago