KoljaB / stream2sentence
Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.
☆62 · Updated this week
Alternatives and similar repositories for stream2sentence
Users that are interested in stream2sentence are comparing it to the libraries listed below
- Efficient approach to speaker diarization using voice characteristics extraction · ☆96 · Updated this week
- Pip-installable package for StyleTTS 2 human-level text-to-speech and voice cloning · ☆159 · Updated 11 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines · ☆95 · Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. · ☆62 · Updated 3 weeks ago
- Real-time Voice Activity Detection (VAD) with some example use cases, such as a simple voice bot and live (real-time) transcription · ☆82 · Updated last year
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine-tune or train a voice model. · ☆38 · Updated last week
- Faster Tortoise inference than Tortoise Fast Fork · ☆127 · Updated last year
- whisper.cpp bindings for Python · ☆98 · Updated last year
- Made slight modifications to the Tortoise API, provided 3 additional scripts to make using Tortoise easier. Less focus on cloning makes s… · ☆52 · Updated last year
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. · ☆214 · Updated 7 months ago
- Whisper real-time streaming for long speech-to-text transcription and translation · ☆119 · Updated last year
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. · ☆52 · Updated 6 months ago
- An open source chat bot architecture for voice/vision (and multimodal) assistants, local (CPU/GPU bound) and remote (I/O bound) to run. · ☆55 · Updated this week
- ☆365 · Updated 9 months ago
- G2P · ☆258 · Updated last month
- Open TTS models, built for streaming on the edge · ☆43 · Updated 3 months ago
- ☆234 · Updated this week
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion · ☆180 · Updated 8 months ago
- Streaming speech-to-text server using Whisper · ☆92 · Updated 2 years ago
- A random walk voice style cloning application for Kokoro text to speech · ☆98 · Updated this week
- Joint speech-language model - respond directly to audio! · ☆30 · Updated last year
- LlamaVoice is a llama-based large voice generation model, providing inference and training ability. · ☆233 · Updated 9 months ago
- ☆258 · Updated last year
- A TTS model capable of generating ultra-realistic dialogue in one pass. · ☆174 · Updated 2 months ago
- Simulates talk with an AI that can express emotions · ☆71 · Updated this week
- FastAPI service on top of WhisperX · ☆109 · Updated this week
- Demo Python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. · ☆46 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. · ☆137 · Updated last year
- Real-time Speech-Text Foundation Model Toolkit (wip) · ☆237 · Updated 2 months ago
- Self-hosted, high-quality voice recognition for de-googled Android using whisper. Like Siri or OK Google. · ☆64 · Updated last year