dawntcherian / Google-speech-to-text-python-websocket-server-using-microphone-stream
Python WebSocket server which converts input audio stream from microphone to text using Google speech to text
☆47Updated 2 years ago
Alternatives and similar repositories for Google-speech-to-text-python-websocket-server-using-microphone-stream
Users who are interested in Google-speech-to-text-python-websocket-server-using-microphone-stream are comparing it to the libraries listed below
- DeepSpeech based forced alignment tool☆237Updated 4 years ago
- Speaker diarization scripts, based on AaltoASR☆190Updated 6 years ago
- ♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).☆85Updated 11 months ago
- Wrapper for pydub AudioSegment objects☆96Updated 2 years ago
- Speaker diarization python system based on binary key speaker modelling☆61Updated 3 years ago
- [deprecated] Pretrained models for pyannote-audio 1.x☆72Updated 2 years ago
- ESPnet Model Zoo☆251Updated last year
- Machine learning experiment to perform gender classification from raw audio.☆23Updated 6 years ago
- 📝An easy-to-use package to restore punctuation of the text.☆116Updated 2 years ago
- Deep Learning - one shot learning for speaker recognition using Filter Banks☆168Updated 11 months ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks☆65Updated 4 years ago
- A crash course for training speech recognition models using DeepSpeech.☆25Updated 4 years ago
- A lightweight library to compute Diarization Error Rate (DER).☆59Updated last year
- Some simple wrappers around eSpeak NG intended to make using this excellent TTS for waveform and IPA generation as convenient as possible…☆42Updated 8 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆149Updated last year
- Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments☆102Updated 5 years ago
- Advanced data structures for handling temporal segments with attached labels.☆113Updated 3 months ago
- Zero-shot Audio Classification using Whisper☆79Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆100Updated 3 months ago
- Evaluate results from ASR/Speech-to-Text quickly☆37Updated 3 years ago
- Adapting your own Language Model for Kaldi☆63Updated 6 years ago
- An HTML interface for finetuning the sync map output from aeneas☆53Updated 2 years ago
- Deep Learning model for lexical stress detection in spoken English☆29Updated 5 years ago
- Speaker embedding (d-vector) trained with GE2E loss☆282Updated last year
- Onnx wrapper for espnet inference model☆162Updated 8 months ago
- PyTorch Implementation of FastSpeech 2 : Fast and High-Quality End-to-End Text to Speech☆230Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning☆217Updated this week
- Various speech datasets made available to the public☆121Updated 5 months ago
- ☆76Updated 3 years ago
- A list of publicly available audio data that anyone can download for ASR or other speech activities☆209Updated 3 years ago