mosave / LVTerminal
Lite Voice Terminal, an "offline smart speaker" solution powered by an on-premises ASR server (Vosk API / Kaldi engine)
☆15Updated 8 months ago
Related projects
Alternatives and complementary repositories for LVTerminal
- A repository for trainable multi-speaker TTS☆14Updated 2 years ago
- Evaluation of STT models for the German language☆15Updated 2 years ago
- Generate samples using Piper to train wake word models☆22Updated 8 months ago
- Russian phonetic transcription☆9Updated 11 months ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆16Updated last year
- A simple, but performant framework for mapping speech directly to categories and intents.☆17Updated 3 months ago
- This repository contains all the code necessary for running the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper.☆18Updated 8 months ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma…☆20Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆28Updated last year
- Proof-of-concept conversation orchestrator with a speech-language model☆14Updated last month
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…☆21Updated 6 months ago
- A simple tool that splits a given audio file by speaker.☆11Updated 10 months ago
- A handy dataset of noises for ASR☆19Updated 5 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Updated last year
- Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…☆14Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System".☆18Updated 2 months ago
- Google Streaming KWS Arm Install Guide☆10Updated 3 years ago
- A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.☆17Updated 3 weeks ago
- This is a TTS model based on VITS that can control the output speech emotion through natural language and control the speaker through ref…☆4Updated 3 months ago
- Unofficial implementation of ConvNeXt-TTS powered by Lightning☆13Updated last month
- Speech to text library for Rhasspy using Kaldi☆14Updated 11 months ago
- STT Vosk REST API☆8Updated 5 months ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆13Updated last year