hernanrazo / human-voice-detection
A binary classifier that detects whether an audio recording contains a human voice. Implemented using PyTorch and Librosa.
☆35 Updated 3 years ago
Alternatives and similar repositories for human-voice-detection:
Users who are interested in human-voice-detection are comparing it to the repositories listed below:
- Classify daily life events using audio data. ☆51 Updated 5 years ago
- Python library for audio augmentation ☆83 Updated last year
- Identify the emotion of multiple speakers in an Audio Segment ☆168 Updated 2 years ago
- Wav2Vec for speech recognition, classification, and audio classification ☆262 Updated 3 years ago
- [deprecated] Pretrained models for pyannote-audio 1.x ☆72 Updated 2 years ago
- ☆24 Updated 6 years ago
- Voice Activity Detection (VAD) using deep learning. ☆195 Updated 5 years ago
- Deep Learning - one shot learning for speaker recognition using Filter Banks ☆165 Updated 9 months ago
- Speaker identification using voice MFCCs and GMM ☆54 Updated 4 years ago
- Machine Learning Sound Classifier ☆135 Updated 5 years ago
- Removing background noise in a sound file ☆63 Updated 5 years ago
- Speaker embedding (d-vector) trained with GE2E loss ☆280 Updated last year
- An in-depth analysis of audio classification on the RAVDESS dataset. Feature engineering, hyperparameter optimization, model evaluation, … ☆75 Updated 4 years ago
- Voice Activity Detection based on Deep Learning & TensorFlow ☆361 Updated 2 years ago
- Speech Emotion Recognition ☆40 Updated last year
- The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems ☆263 Updated last year
- Speech Emotion Recognition (SER) in real time, using Long Short-Term Memory (LSTM) deep neural networks (DNN). ☆110 Updated 3 years ago
- This is the GitHub page for publicly available emotional speech data. ☆345 Updated 3 years ago
- Speaker diarization for the Hindi language. ☆49 Updated 4 years ago
- Extract frequency, power, width and dissonance of formants from wav files ☆25 Updated 2 years ago
- Tools to create your own voice dataset for TTS training ☆66 Updated 4 years ago
- Speech Denoising with Deep Feature Losses ☆185 Updated 4 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆65 Updated 3 years ago
- Python package for openSMILE ☆274 Updated 4 months ago
- 🏥 🎤 The largest clinical study in the world to collect voice data labeled with health information (N>6,000 participants, 48 utterances… ☆28 Updated 2 weeks ago
- Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper ☆141 Updated last year
- Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow ☆128 Updated 4 years ago
- Simple d-vector based Speaker Recognition (verification and identification) using Pytorch ☆211 Updated 4 years ago
- 🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning. ☆41 Updated 3 years ago
- Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo! ☆348 Updated 2 years ago