Picovoice / octopus
On-device Speech-to-Index engine powered by deep learning
☆34 Updated last month
Related projects
Alternatives and complementary repositories for octopus
- On-device noise suppression powered by deep learning ☆63 Updated last month
- On-device voice activity detection (VAD) powered by deep learning ☆179 Updated last week
- TTS Client for Coqui TTS server ☆13 Updated last year
- 🐸TTS recipes for different datasets ☆84 Updated 2 years ago
- 🐍 Coqui's machine learning job scheduler ☆32 Updated 3 years ago
- Joint speech-language model - respond directly to audio! ☆30 Updated 6 months ago
- Web app for keyword spotting using TensorflowJS ☆69 Updated last year
- Simple text to phonemes converter for multiple languages ☆20 Updated 2 years ago
- A collection of pre-built speech synthesis settings used to convey emotion ☆11 Updated 5 years ago
- SEPIA server to support open-source speech recognition via WebSocket connection. ☆120 Updated 2 weeks ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo ☆24 Updated last year
- JavaScript deployment for Howl, the wake word detection modeling toolkit for Firefox Voice ☆10 Updated 4 years ago
- Experiments with generating GPT-2 fanfiction on specified topics. ☆11 Updated 5 years ago
- Create an LJSpeech structured voice dataset on wave input ☆21 Updated last month
- Building blocks for voice-enabled applications in the browser ☆33 Updated last week
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 Updated last month
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆99 Updated last year
- 🫠 check your data, before you wreck your model ☆16 Updated 2 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆285 Updated this week
- On-device speaker diarization powered by deep learning ☆25 Updated this week
- A repo with scripts to test and play around with Facebook's recent llama models! 🤗 ☆29 Updated last year
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level. ☆25 Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆84 Updated 6 months ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆100 Updated last year
- C++ library for converting text to phonemes for Piper ☆89 Updated 8 months ago
- Zero-shot Audio Classification using Whisper ☆74 Updated last year
- An even smaller speech recognizer / force aligner ☆32 Updated last week
- Streamlit app to visualize and edit TTS datasets ☆14 Updated 2 years ago
- Speaker diarization service ☆19 Updated this week
- ☆77 Updated 5 months ago