Honghe / demo_fastapi_websocket
Demo FastAPI WebSocket Audio
☆40 Updated 4 years ago
Alternatives and similar repositories for demo_fastapi_websocket:
Users who are interested in demo_fastapi_websocket are comparing it to the repositories listed below:
- A streaming whisper server for on-prem transcription ☆20 Updated 8 months ago
- Speaker diarization model ☆27 Updated 2 years ago
- A curated list of awesome voice activity detection ☆50 Updated 5 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆112 Updated last year
- Real time web based Speech-to-Text app with Streamlit ☆245 Updated last year
- ☆55 Updated 2 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆148 Updated last year
- ☆34 Updated 10 months ago
- Speaker diarization service ☆21 Updated 3 weeks ago
- Video chat apps with computer vision filters built on top of Streamlit ☆50 Updated last year
- Real-time Voice Activity Detection (VAD) with some example use case like simple voice bot and live transcription (realtime transcription) ☆80 Updated 11 months ago
- Transcription and diarization (speaker identification) ☆34 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆83 Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated last month
- A library for real-time Speech to Text (STT), and Text to Speech (TTS) capability ☆40 Updated last year
- Building a Deep learning model that predicts the gender of a speaker using TensorFlow 2 ☆124 Updated 2 years ago
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python ☆18 Updated last year
- Tunable pipelines ☆33 Updated 2 months ago
- ☆35 Updated 4 years ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆48 Updated 3 weeks ago
- Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the Whisper-medium, designed to enhance performance on mul… ☆15 Updated 3 months ago
- FastAPI on Cloud Run v2: Terraform setup, GitHub Actions CI, Cloud Build triggers, Secret Manager integration ☆18 Updated last year
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- Mirror of hf.co/pyannote/speaker-diarization-3.1 ☆20 Updated last year
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆210 Updated 6 months ago
- A python library to find differences between audio and transcriptions ☆20 Updated last year
- Efficient approach to speaker diarization using voice characteristics extraction ☆94 Updated last year
- ☆50 Updated this week
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆100 Updated 2 months ago