riteshhere / Speaker_diarization
Speech Diarization for scrum automation
☆102 Updated last year
Alternatives and similar repositories for Speaker_diarization:
Users that are interested in Speaker_diarization are comparing it to the libraries listed below
- A lightweight end-to-end text-to-speech model ☆112 Updated last month
- A toolkit for speaker diarization. ☆180 Updated 2 weeks ago
- We Speech Transcript based on LLM, in 300 lines of code. ☆157 Updated last month
- ☆159 Updated 4 months ago
- Open source inference code for Rev's model ☆395 Updated last month
- Live-Transcription (STT) with Whisper PoC ☆175 Updated 9 months ago
- Voice Transformation for Videos. 🎤👄🎬 ☆235 Updated 6 months ago
- Real time faster whisper gradio ☆26 Updated 6 months ago
- ☆171 Updated 7 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆113 Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated 11 months ago
- A real-time AI development framework leveraging WebRTC for audio and video transmission. ☆114 Updated 2 months ago
- SenseVoice-python: An enterprise-grade open source multi-language ASR system from FunASR's open source release, with onnxruntime ☆88 Updated 6 months ago
- Edit videos with a text editor ☆37 Updated last year
- ☆32 Updated last year
- ☆173 Updated last year
- This project provides a RESTful API for converting text to speech using Microsoft's Azure Cognitive Services ☆94 Updated 10 months ago
- openai realtime webrtc python client ☆38 Updated 3 months ago
- OpenAI API and Whisper based Video Translation ☆73 Updated 4 months ago
- g1: Using GPT-4o to create o1-like reasoning chains ☆20 Updated 6 months ago
- Have a natural voice conversation with an LLM ☆246 Updated 4 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆76 Updated 10 months ago
- Nendo is an open source platform for AI-driven audio management, intelligence, and generation. ☆120 Updated last year
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University. ☆412 Updated this week
- RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios ☆53 Updated 4 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task… ☆35 Updated 2 months ago
- FastAPI service on top of WhisperX ☆81 Updated this week
- Engaging in conversation with ChatGPT using voice. ☆27 Updated last year
- Streaming ASR and TTS based on FastAPI+ sherpa-onnx ☆91 Updated 6 months ago
- 🎧 Pod-Helper: Real-time audio transcription and repair on consumer hardware ☆77 Updated last year