DongKeon / webrtc-whisper-asr
WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.
☆12 Updated 9 months ago
Alternatives and similar repositories for webrtc-whisper-asr
Users who are interested in webrtc-whisper-asr are comparing it to the libraries listed below.
- An open source chat bot architecture for voice/vision (and multimodal) assistants, local (CPU/GPU bound) and remote (I/O bound) to run. ☆55 Updated this week
- A lightweight Python library for running TTS models with a unified API. ☆20 Updated 4 months ago
- a simple system for 2-way interruptible voice interactions between human and LLM ☆29 Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆20 Updated 8 months ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆20 Updated 7 months ago
- Speaker diarization service ☆23 Updated 2 months ago
- ☆26 Updated 2 years ago
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆11 Updated 2 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization a ONNX runtime port… ☆21 Updated 9 months ago
- [WIP] AI Try-On plugin for Chrome ☆27 Updated last year
- A streaming whisper server for on-prem transcription ☆20 Updated 10 months ago
- Voice cloning using coqui-TTS ☆11 Updated last year
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆24 Updated last month
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆63 Updated last week
- A python library to find differences between audio and transcriptions ☆20 Updated last year
- zero-shot realtime TTS system, fully offline, free and open source ☆41 Updated 2 months ago
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support. ☆14 Updated last year
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT) ☆17 Updated 2 weeks ago
- ☆33 Updated 3 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆15 Updated 9 months ago
- Cog wrapper for collabora/WhisperSpeech ☆25 Updated last year
- A service which wraps and chains video and audio Hugging Face Spaces together ☆14 Updated 9 months ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆26 Updated last year
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible ☆14 Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated 2 years ago
- Open TTS models, built for streaming on the edge ☆43 Updated 3 months ago
- ☆15 Updated 3 months ago
- Transcription and annotation interface for recorded audio or video files ☆35 Updated this week
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆45 Updated 10 months ago
- A curated list of awesome voice activity detection ☆57 Updated 7 months ago