castorini / howl-deploy
JavaScript deployment for Howl, the wake word detection modeling toolkit for Firefox Voice
★10 · Updated 5 years ago
Alternatives and similar repositories for howl-deploy
Users who are interested in howl-deploy are comparing it to the libraries listed below.
- Web app for keyword spotting using TensorflowJS ★74 · Updated 3 years ago
- 🐸TTS recipes for different datasets ★86 · Updated 3 years ago
- Coqui's machine learning job scheduler ★31 · Updated 4 years ago
- Building blocks for voice-enabled applications in the browser ★37 · Updated 8 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ★151 · Updated last year
- TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog ★62 · Updated last year
- Joint speech-language model - respond directly to audio! ★30 · Updated last year
- Zero-shot Audio Classification using Whisper ★79 · Updated 3 years ago
- Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★261 · Updated 2 months ago
- On-device voice activity detection (VAD) powered by deep learning ★241 · Updated last week
- Speech-to-text based on wav2letter built for transfer learning ★98 · Updated 3 years ago
- Jupyter Notebooks for creating Speech datasets ★46 · Updated 6 years ago
- Repository for fine-tuning 🤗 Transformers based seq2seq speech models in JAX/Flax. ★38 · Updated 2 years ago
- Speaker Diarization with Transformers ★69 · Updated 7 months ago
- The demo page of UniAudio ★34 · Updated last year
- SEPIA server to support open-source speech recognition via WebSocket connection. ★135 · Updated last year
- A lightweight end-of-utterance detection model fine-tuned on SmolLM2-135M, optimized for Raspberry Pi and low-power devices. ★43 · Updated 2 months ago
- Open TTS models, built for streaming on the edge ★44 · Updated 9 months ago
- The Gridspace-Stanford Harper Valley speech dataset. Created in support of CS224S. ★49 · Updated 4 years ago
- ★76 · Updated 4 years ago
- 🐸STT integration examples ★130 · Updated 3 years ago
- A phoneme-allophone database for many languages ★53 · Updated 5 years ago
- Google's SoundStorm: Efficient Parallel Audio Generation ★131 · Updated 2 years ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model ★107 · Updated 4 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ★111 · Updated 3 years ago
- Datasets for turn-taking research ★17 · Updated 2 years ago
- BotSIM - a data-efficient end-to-end Bot SIMulation toolkit for evaluation, diagnosis, and improvement of commercial chatbots ★116 · Updated 8 months ago
- A model that predicts the punctuation of English, Italian, French and German texts. ★83 · Updated 2 years ago
- Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments ★103 · Updated 5 years ago
- Putting flows on top of neural transducers for better TTS ★64 · Updated last month