riteshhere / Speaker_diarization
Speech Diarization for scrum automation
☆111Updated 2 years ago
Alternatives and similar repositories for Speaker_diarization
Users who are interested in Speaker_diarization are comparing it to the libraries listed below.
- Open source inference code for Rev's model☆435Updated 8 months ago
- A lightweight end-to-end text-to-speech model☆125Updated 10 months ago
- ☆167Updated last year
- Have a natural voice conversation with an LLM☆259Updated 2 months ago
- A toolkit for speaker diarization.☆348Updated 2 weeks ago
- Live-Transcription (STT) with Whisper PoC☆202Updated last year
- We Speech Transcript based on LLM, in 300 lines of code.☆181Updated 6 months ago
- ☆175Updated 2 years ago
- Voice Transformation for Videos. 🎤👄🎬☆259Updated 6 months ago
- ☆337Updated 9 months ago
- ☆355Updated last year
- Whisper realtime streaming for long speech-to-text transcription and translation☆121Updated last year
- ☆472Updated 7 months ago
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware☆77Updated last year
- Streaming ASR and TTS based on FastAPI + sherpa-onnx☆174Updated last month
- RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios☆76Updated 5 months ago
- Dynamic Voice Actor Assignment and Emotional Narration for Realistic Story Play☆45Updated 8 months ago
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation.☆129Updated last year
- FastAPI service on top of WhisperX☆158Updated this week
- ASR (Automatic Speech Recognition) for real-time streamed audio powered by Whisper and transformers☆36Updated last year
- OpenAI API and Whisper based Video Translation☆74Updated last year
- SenseVoice-python: An enterprise-grade, open source, multi-language ASR system from FunASR, run with onnxruntime☆108Updated 2 months ago
- ☆34Updated last year
- ASR + diarization model server with speculative decoding☆63Updated last year
- Edit videos with a text editor☆37Updated 2 years ago
- An open source chat bot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound).☆87Updated this week
- GPT-4o-level, real-time spoken dialogue system.☆362Updated 10 months ago
- Examples for Cerebrium Serverless GPUs☆514Updated last week
- A simple Google Colab notebook which can translate an original video into multiple languages along with lip sync.☆252Updated 9 months ago
- Local SRT/LLM/TTS Voicechat☆745Updated last year