danijel3 / audio_gui
Simple audio recorder that sends WAV from browser to server in Python (Flask).
☆31 · Updated 3 years ago
Alternatives and similar repositories for audio_gui
Users that are interested in audio_gui are comparing it to the libraries listed below
- End-to-End Speech Recognition using Neural Networks. ☆35 · Updated last year
- Speaker diarization python system based on binary key speaker modelling ☆60 · Updated 3 years ago
- Advanced data structures for handling temporal segments with attached labels. ☆124 · Updated 3 months ago
- [deprecated] Pretrained models for pyannote-audio 1.x ☆71 · Updated 3 years ago
- Python toolkit for speech processing ☆72 · Updated 3 weeks ago
- Python library for handling audio datasets. ☆138 · Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 · Updated 2 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆68 · Updated 4 years ago
- ☆76 · Updated 4 years ago
- Deep Neural Network for Speaker Count Estimation ☆157 · Updated 5 years ago
- ☆37 · Updated last month
- Pytorch implementation of Deepmind's WaveRNN model ☆123 · Updated 6 years ago
- Python library for audio augmentation ☆85 · Updated 2 years ago
- PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech ☆232 · Updated 3 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 · Updated 2 years ago
- Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed. ☆148 · Updated 3 years ago
- Scripts to simplify data prepping for Mozilla DeepSpeech. ☆14 · Updated 6 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data ☆96 · Updated 2 years ago
- Articulatory features estimation using Listen Attend and Spell architecture. ☆32 · Updated 5 years ago
- Speaker Diarization is the problem of separating speakers in an audio. There could be any number of speakers and final result should stat… ☆64 · Updated 5 years ago
- ♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy). ☆90 · Updated last year
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆62 · Updated 5 years ago
- Machine learning experiment to perform gender classification from raw audio. ☆23 · Updated 7 years ago
- Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper ☆141 · Updated 2 years ago
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆246 · Updated 6 years ago
- A list of publicly available audio data that anyone can download for ASR or other speech activities ☆232 · Updated 4 years ago
- Repository containing experimentation platform on how to train, infer on wav2vec2 models. ☆87 · Updated 3 years ago
- A lightweight neural speaker embeddings extraction based on Kaldi and PyTorch. ☆136 · Updated 5 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆151 · Updated 7 months ago
- LogMMSE speech enhancement/noise reduction ☆90 · Updated 5 years ago