dawntcherian / Google-speech-to-text-python-websocket-server-using-microphone-stream
Python WebSocket server that converts a microphone audio stream to text using Google Speech-to-Text
☆44Updated 2 years ago
Alternatives and similar repositories for Google-speech-to-text-python-websocket-server-using-microphone-stream:
Users who are interested in Google-speech-to-text-python-websocket-server-using-microphone-stream are comparing it to the libraries listed below.
- [deprecated] Pretrained models for pyannote-audio 1.x☆72Updated 2 years ago
- Wrapper for pydub AudioSegment objects☆96Updated 2 years ago
- Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments☆102Updated 4 years ago
- Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).☆80Updated 7 months ago
- A lightweight library to compute Diarization Error Rate (DER).☆59Updated last year
- Speaker diarization python system based on binary key speaker modelling☆61Updated 3 years ago
- DeepSpeech based forced alignment tool☆234Updated 4 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks☆63Updated 4 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆144Updated 8 months ago
- This repository is a collection of TTS Models in TFLite☆189Updated 3 years ago
- Speaker diarization scripts, based on AaltoASR☆190Updated 6 years ago
- Python library for handling audio datasets.☆136Updated last year
- Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data☆96Updated last year
- Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).☆274Updated last year
- GSoC'2021 | TensorFlow implementation of Wav2Vec2☆91Updated 3 years ago
- Speech noise reduction which was generated using existing post-production techniques implemented in Python☆176Updated 3 years ago
- Removes silence segments from wav audio files☆29Updated 4 years ago
- ☆65Updated last month
- ☆38Updated 11 months ago
- A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech.☆240Updated 2 years ago
- Support tools for punctuation and boundary detection for ASR output.☆57Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆92Updated this week
- Paper: https://arxiv.org/abs/1702.02285☆63Updated 6 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆100Updated last year
- Using speaker embedding for diarization in PyTorch☆18Updated 4 years ago
- End-to-end speech recognition using RNN Transducers in Tensorflow 2.0☆243Updated 3 years ago
- Data pipeline that converts raw audio data to speech, which serves as the input dataset for a speech-to-text pipeline☆32Updated last year
- Pure-Python CTC decoder implementation. Also supports language model decoding using KenLM.☆37Updated 8 months ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆111Updated 2 years ago
- Diarization scoring tools.☆232Updated last year