ReadAlongs / SoundSwallower
An even smaller speech recognizer / force aligner
☆32 Updated 2 months ago
Alternatives and similar repositories for SoundSwallower:
Users who are interested in SoundSwallower are comparing it to the libraries listed below.
- The EveryVoice TTS Toolkit - Text To Speech for your language ☆24 Updated 2 weeks ago
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re… ☆47 Updated last year
- A free & open tool for transcribing audio interviews with offline ASR support ☆24 Updated last year
- TTS Client for Coqui TTS server ☆13 Updated 2 years ago
- Coqui Inference Engine ☆38 Updated 3 years ago
- 🫠 check your data, before you wreck your model ☆16 Updated 2 years ago
- On-device speaker diarization powered by deep learning ☆37 Updated 2 weeks ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆148 Updated last month
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger. ☆35 Updated 2 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- Audiobook alignment for Indigenous languages ☆38 Updated this week
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆31 Updated last year
- C++ version of pyannote audio speaker diarization pipeline ☆20 Updated last year
- Python wrapper for phonetisaurus grapheme to phoneme tool ☆12 Updated 3 years ago
- Robust Speech Recognition via Large-Scale Weak Supervision ☆30 Updated last year
- 📈 A forced aligner intended for synchronization of narrated text ☆90 Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆109 Updated 2 years ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- Uses ctypes and libespeak-ng to transform text into IPA phonemes ☆20 Updated last year
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆93 Updated 4 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆19 Updated 4 months ago
- ☆33 Updated 8 months ago
- ☆80 Updated 9 months ago
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆22 Updated 7 months ago
- Labeled data for homograph disambiguation ☆56 Updated last year
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core ☆15 Updated last year
- ☆21 Updated last month
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- ☆12 Updated 2 years ago