goepfert / audio_features
Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with Tensorflow.js
☆12Updated 2 years ago
Related projects:
- A java wrapper around the WebRTC Voice Activity Detection library☆55Updated 3 years ago
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re…☆44Updated last year
- Evaluate results from ASR/Speech-to-Text quickly☆35Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning☆165Updated 2 weeks ago
- Tacotron text to speech in C++ (synthesize only)☆75Updated 4 years ago
- Experiments to test different speech recognition systems for SEPIA Framework☆57Updated last year
- ☆42Updated 2 years ago
- Kaldi based speaker verification☆47Updated 6 years ago
- Web app for keyword spotting using TensorflowJS☆69Updated last year
- ☆30Updated 7 months ago
- Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.☆220Updated 4 years ago
- ☆43Updated 3 months ago
- text-independent speaker identification☆12Updated 6 years ago
- Aalto Automatic Speech Recognition tools☆85Updated 7 years ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆217Updated last month
- 🐸TTS recipes for different datasets☆84Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆95Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆100Updated last year
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,…☆276Updated 3 years ago
- A personal toolkit for single/multi-channel speech recognition & enhancement & separation.☆139Updated last year
- Extract formant features such as frequency, power, energy, and bandwidth of formants at syllable or word level from audio sources in a we…☆26Updated last year
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…☆194Updated 2 years ago
- Python server for communicating with Kaldi from the browser using WebRTC☆67Updated 11 months ago
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code☆193Updated 2 years ago
- How to create your own model for vosk☆63Updated 3 years ago
- A Convolutional Neural Network based Voice Activity Detector for Smartphones☆68Updated 5 years ago
- This repository is a collection of TTS Models in TFLite☆186Updated 3 years ago
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference☆30Updated 4 years ago
- Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search.☆15Updated 3 years ago
- Port of the OpenFST library to Windows☆67Updated 4 months ago