shvmshukla / Speaker-Change-Detection
Speaker diarization is an early step in many audio processing pipelines and aims to answer the question "who spoke when". It therefore relies on efficient use of temporal information from extracted audio features.
☆11 Updated 5 years ago
Related projects
Alternatives and complementary repositories for Speaker-Change-Detection
- The implementation of 'Watch, Listen, Attend and Spell' (WLAS) network that learns to transcribe videos of mouth motion to character on p… ☆11 Updated 6 years ago
- Automatic Speech Recognition Dataset Generation ☆36 Updated 6 years ago
- Pytorch Code for S2IGAN ☆41 Updated 4 years ago
- Collection of models and extensions for deployment in PyTorch ☆24 Updated last year
- ☆31 Updated 6 years ago
- Anonymous ICLR Submission ☆14 Updated 5 years ago
- Random regression forests for audio event detection ☆9 Updated 7 years ago
- A Text2Speech Engine built in Pytorch. ☆11 Updated 5 years ago
- Sound augmentation using Large-scale audio dataset (Audioset) ☆44 Updated 3 years ago
- Comprehensive Python library for speech and voice. ☆33 Updated last year
- ☆27 Updated 5 years ago
- Project to learn about speech recognition - both Speaker Diarization and other Speech Recognition applications. ☆47 Updated 7 years ago
- Audio Classification using Image Classification ☆49 Updated 4 years ago
- PyTorch implementations of neural network models for keyword spotting ☆11 Updated 4 years ago
- Experiment with "one-shot learning" techniques to recognize a voice signature ☆24 Updated 4 years ago
- Demo page of our paper Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks With Guided Attention, ICASSP 201… ☆15 Updated 3 years ago
- ☆34 Updated 5 years ago
- Implementation of "FastSpeech: Fast, Robust and Controllable Text to Speech" ☆64 Updated last year
- ☆19 Updated 6 years ago
- Cochlear.ai submission for dcase2018 task2 ☆17 Updated 6 years ago
- Emotion recognition of Speaker's Speech Data. Employ speaker detection classifiers for emotion recognition, a multiclass classification p… ☆16 Updated 9 years ago
- Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition. ☆36 Updated 5 years ago
- Repository for Weak Label Learning for Audio Events - A closer look. Uses Audioset subset data provided for reproducibility. ☆32 Updated last year
- Run speaker recognition algorithms - Mirrored from https://gitlab.idiap.ch/bob/bob.bio.spear ☆19 Updated last year
- The Additive Margin MobileNet1D is a new light weight deep learning model for Speaker Recognition which is based on the MobileNetV2 archi… ☆29 Updated last year
- End to End Multiview Lip Reading ☆10 Updated 6 years ago
- ESPnet-TTS Audio Sample HP ☆21 Updated 5 years ago
- A punctuation transcription model to automatically add punctuation marks in an unpunctuated sentence or sentences. ☆15 Updated 4 years ago