coqui-ai / STT-models
Open models for Coqui STT
★129, updated last year
Alternatives and similar repositories for STT-models:
Users who are interested in STT-models are comparing it to the repositories listed below.
- 🐸STT integration examples (★125, updated 2 years ago)
- Official Implementation of StyleTTS (★418, updated last month)
- A tokenizer, text cleaner, and phonemizer for many human languages. (★303, updated 3 months ago)
- On-device voice activity detection (VAD) powered by deep learning (★198, updated this week)
- Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation (★246, updated last year)
- 🐸 - A general purpose model trainer, as flexible as it gets (★205, updated 11 months ago)
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. (★235, updated 8 months ago)
- C++ library for converting text to phonemes for Piper (★105, updated 11 months ago)
- Real-Time Whisper Voice Recognition with vosk model feedback. (★109, updated last year)
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. (★135, updated last year)
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code (★145, updated 9 months ago)
- (★182, updated 2 years ago)
- Live speech recognition using Facebook's wav2vec 2.0 model. (★341, updated last year)
- Metadata and versioning details for the Common Voice dataset (★145, updated 2 months ago)
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. (★160, updated 11 months ago)
- (★254, updated 11 months ago)
- Performant and accurate speech recognition built on Pytorch (★252, updated 2 years ago)
- [WIP] VoiceSmith makes training text to speech models easy. (★224, updated 2 years ago)
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts (★306, updated 3 months ago)
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code (★199, updated 2 years ago)
- Finetune VITS and MMS using HuggingFace's tools (★134, updated 10 months ago)
- Application of MB-iSTFT-VITS components to vits2_pytorch (★121, updated 3 months ago)
- openvino version of openai/whisper (★165, updated last year)
- ONNX Inference of Pyannote Segmentation (★80, updated last month)
- (★312, updated 7 months ago)
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions (★234, updated last month)
- (★251, updated last year)
- Desktop application for neural speech synthesis written in C++ (★213, updated last year)
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion (★171, updated 4 months ago)
- QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion (★236, updated last year)