DongKeon / webrtc-whisper-asr
WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.
☆12 · Updated 8 months ago
Alternatives and similar repositories for webrtc-whisper-asr
Users who are interested in webrtc-whisper-asr are comparing it to the libraries listed below.
- A simple system for 2-way interruptible voice interactions between human and LLM ☆29 · Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆38 · Updated 2 weeks ago
- A streaming whisper server for on-prem transcription ☆20 · Updated 9 months ago
- Cog wrapper for collabora/WhisperSpeech ☆25 · Updated last year
- This project provides a Flask-based API for generating high-quality text-to-speech (TTS) audio using F5-TTS, a flexible and powerful TTS … ☆12 · Updated 2 months ago
- A lightweight Python library for running TTS models with a unified API. ☆18 · Updated 3 months ago
- An open source chat bot architecture for voice/vision (and multimodal) assistants, local (CPU/GPU bound) and remote (I/O bound) to run. ☆53 · Updated this week
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support. ☆14 · Updated last year
- Speaker diarization service ☆23 · Updated last month
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆15 · Updated 9 months ago
- A lightweight, efficient variation of the StyleTTS 2 text-to-speech model. ☆18 · Updated 2 weeks ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆19 · Updated 7 months ago
- ☆20 · Updated 2 weeks ago
- Demo Python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆45 · Updated last year
- An open source NLP as a service project focused on providing state of the art systems with ease. Training and inference by simple docker … ☆20 · Updated 8 months ago
- A curated list of awesome voice activity detection ☆54 · Updated 6 months ago
- llmon-py is a multimodal webui for Llama 3-8B. ☆16 · Updated 11 months ago
- Text To Speech Multilingual Support (20+ languages) ☆45 · Updated 2 years ago
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh… ☆11 · Updated last year
- Open TTS models, built for streaming on the edge ☆43 · Updated 2 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆56 · Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 · Updated last week
- ☆15 · Updated 2 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆81 · Updated last year
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆11 · Updated 2 months ago
- Accelerate Whisper tasks such as transcription by parallelizing them with multiprocessing ☆25 · Updated 2 years ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very … ☆37 · Updated 6 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization an ONNX runtime port… ☆21 · Updated 9 months ago
- A lightweight end-of-utterance detection model fine-tuned on SmolLM2-135M, optimized for Raspberry Pi and low-power devices. ☆21 · Updated 2 months ago
- A framework for creating voice-based agents. Integrates LLMs with speech recognition and text-to-speech ☆33 · Updated last year