avryhof / speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
☆13 Updated 2 years ago
Related projects:
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 Updated last year
- Evaluation of STT models for the German language ☆15 Updated 2 years ago
- Command line utility for rustpotter, an open source wakeword spotter forged in Rust ☆10 Updated 11 months ago
- Using YouTube to prepare a speech recognition dataset for any language ☆10 Updated 3 years ago
- ☆8 Updated last year
- A free & open tool for transcribing audio interviews with offline ASR support ☆24 Updated 9 months ago
- Wake word spotting with Kaldi ☆18 Updated 3 years ago
- Speech to text library for Rhasspy using Kaldi ☆14 Updated 9 months ago
- ☆11 Updated 9 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆27 Updated last year
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆13 Updated last year
- Faster Whisper ASR transcription with CTranslate2 ☆11 Updated last week
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆24 Updated last year
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆22 Updated last year
- 📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi. ☆21 Updated 5 years ago
- This is a TTS model based on VITS that can control the output speech emotion through natural language and control the speaker through ref… ☆4 Updated last month
- TTS Client for Coqui TTS server ☆13 Updated last year
- Lite Voice Terminal, an "offline smart speaker" solution powered by on-premise ASR server (Vosk API / Kaldi engine) ☆14 Updated 6 months ago
- A repository for trainable multi-speaker TTS ☆14 Updated 2 years ago
- Python wrapper for the Phonetisaurus grapheme-to-phoneme tool ☆11 Updated 3 years ago
- ☆16 Updated 3 years ago
- A simple but performant framework for mapping speech directly to categories and intents. ☆15 Updated last month
- ☆22 Updated 3 years ago
- How to create your own model for Vosk ☆63 Updated 3 years ago
- Keyword Spotting (KWS) API wrapper for TFLite streaming models. ☆11 Updated 3 years ago
- BBB plugin for automatic subtitles in conference calls ☆26 Updated 2 years ago
- A handy dataset of noises for ASR ☆19 Updated 5 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 Updated 7 months ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆12 Updated 3 years ago
- A corpus of speech from the Joe Rogan Experience podcast, consisting of 8.43 million words. It includes aligned TextGrids with phonetic a… ☆16 Updated 4 years ago