goepfert / audio_features
Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with Tensorflow.js
☆13 Updated 4 years ago
Alternatives and similar repositories for audio_features
Users who are interested in audio_features are comparing it to the libraries listed below.
- A java wrapper around the WebRTC Voice Activity Detection library ☆66 Updated 4 years ago
- Web app for keyword spotting using TensorflowJS ☆74 Updated 3 years ago
- Extract formant features such as frequency, power, energy, and bandwidth of formants at syllable or word level from audio sources in a we… ☆36 Updated last year
- ☆43 Updated last year
- Integration of Fastspeech Text to Mel generation and fast Vocoder Squeezewave ☆20 Updated 2 years ago
- Fine-tune WhisperAI model to your language ☆21 Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning ☆241 Updated last week
- Jupyter Notebooks for creating Speech datasets ☆46 Updated 6 years ago
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re… ☆46 Updated 2 years ago
- Python server for communicating with Kaldi from the browser using WebRTC ☆69 Updated 2 years ago
- Speaker diarization scripts, based on AaltoASR ☆191 Updated 7 years ago
- speaker diarization system using an LSTM ☆50 Updated 3 years ago
- Tools for speech processing, keyword spotting ☆17 Updated 5 years ago
- Quartznet implementation on pytorch [https://arxiv.org/abs/1910.10261] ☆26 Updated 4 years ago
- SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition… ☆99 Updated 3 years ago
- 🐸TTS recipes for different datasets ☆86 Updated 3 years ago
- Speaker diarization python system based on binary key speaker modelling ☆60 Updated 3 years ago
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search. ☆16 Updated 5 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆153 Updated last year
- Speaker Diarization is the problem of separating speakers in an audio. There could be any number of speakers and final result should stat… ☆64 Updated 5 years ago
- flask+tornado based NVIDIA tacotron2+waveglow tts web app ☆29 Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 Updated 2 years ago
- STT Service based on Kaldi ASR ☆15 Updated 7 years ago
- ☆45 Updated last year
- SEPIA server to support open-source speech recognition via WebSocket connection. ☆134 Updated last year
- A Python toolbox for speech features extraction ☆166 Updated 2 years ago
- Automatic Speech Recognition Dataset Generation ☆37 Updated 7 years ago
- Python implementation of pre-processing for End-to-End speech recognition ☆69 Updated 7 years ago
- Forced Alignments for Common Voice ☆32 Updated 5 years ago