solyarisoftware / CoquiSTTJs
Coqui STT offline engine API for NodeJs developers. With a simple HTTP ASR server.
☆27 · Updated 3 years ago
Alternatives and similar repositories for CoquiSTTJs:
Users who are interested in CoquiSTTJs are comparing it to the libraries listed below.
- On-device speaker diarization powered by deep learning ☆39 · Updated this week
- Coqui's machine learning job scheduler ☆32 · Updated 3 years ago
- An even smaller speech recognizer / force aligner ☆32 · Updated 3 months ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆25 · Updated last year
- Web app for keyword spotting using TensorflowJS ☆70 · Updated 2 years ago
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re… ☆47 · Updated last year
- Evaluate results from ASR/Speech-to-Text quickly ☆36 · Updated 3 years ago
- On-device voice activity detection (VAD) powered by deep learning ☆202 · Updated this week
- Create an LJSpeech structured voice dataset on wave input ☆26 · Updated 5 months ago
- 🐸STT integration examples ☆126 · Updated 2 years ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆112 · Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆60 · Updated last week
- Community framework for training tortoise ☆41 · Updated 2 years ago
- JavaScript deployment for Howl, the wake word detection modeling toolkit for Firefox Voice ☆10 · Updated 4 years ago
- Simple Diarization model ☆47 · Updated last year
- Vosk ASR offline engine API for NodeJs developers. With a simple HTTP ASR server. ☆45 · Updated 3 years ago
- Misc. tools/scripts that I made to use for tortoise ☆21 · Updated 7 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆135 · Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 · Updated 10 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆92 · Updated 11 months ago
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆22 · Updated 7 months ago
- Real-time Voice Activity Detection (VAD) with some example use cases, such as a simple voice bot and live (real-time) transcription ☆71 · Updated 9 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆145 · Updated 10 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆45 · Updated last month
- Joint speech-language model - respond directly to audio! ☆30 · Updated 10 months ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 · Updated 2 years ago
- check your data, before you wreck your model ☆16 · Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆94 · Updated 5 months ago
- A free & open tool for transcribing audio interviews with offline ASR support ☆24 · Updated last year
- On-device speaker recognition engine powered by deep learning ☆32 · Updated this week