kadirnar / VoiceHub
VoiceHub: A Unified Inference Interface for TTS Models
☆42 Updated last week
Alternatives and similar repositories for VoiceHub
Users who are interested in VoiceHub are comparing it to the libraries listed below.
- Callytics is an advanced call analytics solution that leverages speech recognition and large language models (LLMs) technologies to analy… ☆70 Updated 3 months ago
- Turkish Speech Recognition using Facebook's Wav2vec 2.0 models ☆28 Updated 3 years ago
- Open TTS models, built for streaming on the edge ☆43 Updated 4 months ago
- A lightweight Python library for running TTS models with a unified API. ☆20 Updated 4 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆63 Updated last month
- a simple system for 2-way interruptible voice interactions between human and LLM ☆30 Updated last year
- Open-source and reproducible benchmarks for Speaker Diarization ☆29 Updated last week
- Collection of Open Source Speech Data ☆159 Updated 8 months ago
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆34 Updated last month
- Audio tokenization, in the fastest way possible! ☆52 Updated 10 months ago
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life. ☆34 Updated last month
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆90 Updated last month
- Speech synthesis (TTS) in low-resource languages by training from scratch with Fastpitch and fine-tuning with HifiGan ☆58 Updated last year
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆20 Updated 4 months ago
- ☆51 Updated 2 weeks ago
- Onnx compatible styletts2 code ☆12 Updated last month
- ☆16 Updated 4 months ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF. ☆19 Updated 5 months ago
- Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the Whisper-medium, designed to enhance performance on mul… ☆16 Updated 5 months ago
- An example repository to use HuggingFace smolagents, Phidata and CrewAI frameworks with local LLMs ☆38 Updated 6 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 2 months ago
- 🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod. ☆27 Updated 4 months ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆19 Updated last month
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper ☆28 Updated 11 months ago
- The Vokan Architecture (Tsukasa speech based) ☆10 Updated 5 months ago
- Dippy Synthetic Speech Subnet ☆16 Updated last month
- Joint speech-language model - respond directly to audio! ☆30 Updated last year
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆261 Updated last month
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆63 Updated 9 months ago
- ☆85 Updated last year