Speech-VINO / Smart-Media-Player
For our Smart Media Player project, which detects the time periods in audio/video during which specific persons are speaking
☆18 Updated 4 years ago
Related projects
Alternatives and complementary repositories for Smart-Media-Player
- Using speaker embedding for diarization in PyTorch ☆18 Updated 4 years ago
- speaker_diarization done on toy dataset and tested on timit dataset ☆8 Updated 2 years ago
- Speaker Diarization using GRU in PyTorch ☆11 Updated 4 years ago
- The repository contains all the codes necessary for my project - Automatic Speech Recognition System in Hindi Language ( Project descript… ☆28 Updated 4 years ago
- A fully convolution-network for speech-to-text, built on pytorch. ☆125 Updated 4 years ago
- Augmented Audio Data Generator for 1D-Convolutional Neural Networks ☆49 Updated 3 years ago
- [deprecated] Pretrained models for pyannote-audio 1.x ☆71 Updated 2 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆31 Updated 3 years ago
- Feature extraction of speech signal is the initial stage of any speech recognition system. ☆91 Updated 4 years ago
- For our speech emotion recognition project ☆28 Updated 3 years ago
- Audio data augmentation examples ☆35 Updated 6 years ago
- Easy-to-use Connectionist Temporal Classification in Keras ☆77 Updated 3 years ago
- ☆45 Updated 6 years ago
- SpeechYOLO Interspeech 2019 ☆42 Updated 2 years ago
- Conv-LSTM-CTC speech recognition network (end-to-end), written in TensorFlow. ☆71 Updated 5 years ago
- How to do Real Time Trigger Word Detection with Keras | DLology ☆162 Updated 5 years ago
- A neural attention model for speech command recognition ☆180 Updated last year
- Built a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline. ☆48 Updated 5 years ago
- GSoC'2021 | TensorFlow implementation of Wav2Vec2 ☆88 Updated 2 years ago
- Detecting emotions using MFCC features of human speech using Deep Learning ☆125 Updated 3 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data ☆95 Updated last year
- https://www.kaggle.com/c/tensorflow-speech-recognition-challenge/ ☆20 Updated 6 years ago
- Repository containing experimentation platform on how to train, infer on wav2vec2 models. ☆85 Updated 2 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆110 Updated 2 years ago
- The Additive Margin SincNet (AM-SincNet) is a new approach for speaker recognition problems which is based in the neural network architec… ☆43 Updated last year
- End-to-End Speech Recognition Using Tensorflow ☆41 Updated last year
- Urban sounds classification with Convolutional Neural Networks ☆36 Updated 4 years ago
- ☆90 Updated last year
- DeepSpeech, Speech To Text, ASR, Speech recognition, Keras, Tensorflow ☆30 Updated 6 years ago