ReadAlongs / SoundSwallower
An even smaller speech recognizer / force aligner
☆32Updated 2 months ago
Related projects
Alternatives and complementary repositories for SoundSwallower
- The EveryVoice TTS Toolkit - Text To Speech for your language☆21Updated this week
- A free & open tool for transcribing audio interviews with offline ASR support☆24Updated 10 months ago
- SEPIA server to support open-source speech recognition via WebSocket connection.☆120Updated this week
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆134Updated this week
- On-device speaker diarization powered by deep learning☆25Updated last month
- Create modular, cross-browser, web audio pipelines to record and process audio in background threads. Comes with modules for VAD, ASR, re…☆44Updated last year
- brainless concatenative text to speech☆11Updated 3 years ago
- Experiments to test different speech recognition systems for SEPIA Framework☆57Updated last year
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger.☆32Updated last year
- On-device noise suppression powered by deep learning☆62Updated last month
- Uses ctypes and libespeak-ng to transform text into IPA phonemes☆20Updated last year
- TTS Client for Coqui TTS server☆13Updated last year
- Simple Diarization model☆42Updated 11 months ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo☆24Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆98Updated last year
- 🐍 Coqui's machine learning job scheduler☆32Updated 3 years ago
- On-device voice activity detection (VAD) powered by deep learning☆173Updated 2 weeks ago
- C++ version of pyannote audio speaker diarization pipeline☆18Updated 8 months ago
- Web app for keyword spotting using TensorflowJS☆69Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆100Updated last year
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup☆58Updated 2 months ago
- Running the F5-TTS by ONNX Runtime☆25Updated this week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆83Updated last month
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones.☆29Updated 9 months ago
- 🎹 pyannote + 🗒 notebook = pyannotebook☆25Updated last year
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech…☆17Updated last year
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW…☆15Updated last year
- Audiobook alignment for Indigenous languages☆37Updated this week