goepfert / audio_features
Speech Recognition and Voice Activity Detection using a Convolutional Neural Network Architecture built with TensorFlow.js
☆13 Updated 3 years ago
Alternatives and similar repositories for audio_features
Users who are interested in audio_features are comparing it to the libraries listed below.
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re… ☆47 Updated 2 years ago
- Web app for keyword spotting using TensorFlow.js ☆73 Updated 2 years ago
- 🐸TTS recipes for different datasets ☆86 Updated 3 years ago
- On-device voice activity detection (VAD) powered by deep learning ☆228 Updated last month
- Project repository for the work done in Triplet Entropy Loss: Improving The Generalization of Short Speech Language Identification Syst… ☆13 Updated 4 years ago
- A pipeline to isolate and transcribe one language in mixed-language speech ☆19 Updated 2 years ago
- A Java wrapper around the WebRTC Voice Activity Detection library ☆65 Updated 4 years ago
- Fine-tune a WhisperAI model for your language ☆21 Updated 2 years ago
- How to create your own model for Vosk ☆73 Updated 4 years ago
- SEPIA server to support open-source speech recognition via WebSocket connection. ☆131 Updated 10 months ago
- Python server for communicating with Kaldi from the browser using WebRTC ☆69 Updated last year
- ☆43 Updated last year
- Evaluate results from ASR/Speech-to-Text quickly ☆38 Updated 3 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 Updated 2 years ago
- Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone ☆36 Updated 3 years ago
- 🐍 Coqui's machine learning job scheduler ☆32 Updated 4 years ago
- QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261] ☆27 Updated 4 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 Updated 2 years ago
- Extract formant features such as frequency, power, energy, and bandwidth of formants at syllable or word level from audio sources in a we… ☆34 Updated 10 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆118 Updated 2 years ago
- Jupyter notebooks for creating speech datasets ☆46 Updated 6 years ago
- Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments ☆102 Updated 5 years ago
- 🐸STT integration examples ☆128 Updated 2 years ago
- ☆41 Updated last year
- This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) wit… ☆170 Updated 4 years ago
- A complete speech segmentation system using Kaldi and x-vectors for voice activity detection (VAD) and speaker diarisation. ☆31 Updated last year
- A non-native English corpus for pronunciation scoring task ☆151 Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆150 Updated last year
- Extract frequency, power, width and dissonance of formants from wav files ☆26 Updated 3 years ago
- Goodness of Pronunciation using Kaldi on Epa-DB database ☆35 Updated last year