danijel3 / audio_gui
Simple audio recorder that sends WAV from browser to server in Python (Flask).
☆31 Updated 3 years ago
Alternatives and similar repositories for audio_gui
Users who are interested in audio_gui are comparing it to the libraries listed below.
- Python library for handling audio datasets. ☆138 Updated 2 years ago
- Speaker diarization Python system based on binary key speaker modelling ☆60 Updated 3 years ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ☆229 Updated 3 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆66 Updated 4 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 Updated 2 years ago
- A collection of Audio and Speech pre-trained models. ☆193 Updated 5 years ago
- Python toolkit for speech processing ☆72 Updated this week
- End-to-End Speech Recognition using Neural Networks. ☆35 Updated last year
- Multilingual Grapheme to Phoneme ☆50 Updated 9 years ago
- A Python toolbox for speech features extraction ☆165 Updated 2 years ago
- Advanced data structures for handling temporal segments with attached labels. ☆121 Updated last month
- [deprecated] Pretrained models for pyannote-audio 1.x ☆71 Updated 3 years ago
- ♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy). ☆89 Updated last year
- Text Independent Speaker Verification Using GE2E Loss ☆84 Updated 6 years ago
- Deep Neural Network for Speaker Count Estimation ☆156 Updated 5 years ago
- Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19) ☆147 Updated 2 years ago
- ☆76 Updated 4 years ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆257 Updated last year
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆245 Updated 6 years ago
- Speaker Diarization is the problem of separating speakers in an audio. There could be any number of speakers and final result should stat… ☆64 Updated 4 years ago
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data ☆95 Updated 2 years ago
- 🏥 🎤 The largest clinical study in the world to collect voice data labeled with health information (N>6,000 participants, 48 utterances… ☆31 Updated 6 months ago
- Python library for audio augmentation ☆84 Updated 2 years ago
- Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed. ☆148 Updated 3 years ago
- A Collection of Speech Corpus for ASR and TTS ☆114 Updated 8 years ago
- A lightweight library to compute Diarization Error Rate (DER). ☆62 Updated 2 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆74 Updated 5 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆62 Updated 5 years ago
- Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper ☆141 Updated 2 years ago
- Wav2Keyword is keyword spotting (KWS) based on Wav2Vec 2.0. This model shows state-of-the-art results on the Speech Commands dataset V1 and V2. ☆108 Updated 2 years ago