kadirnar / VoiceHub
VoiceHub: A Unified Inference Interface for TTS Models
☆52 Updated 3 weeks ago
Alternatives and similar repositories for VoiceHub
Users who are interested in VoiceHub are comparing it to the repositories listed below.
- Callytics is an advanced call analytics solution that leverages speech recognition and large language model (LLM) technologies to analy… ☆71 Updated 5 months ago
- ☆144 Updated last month
- A lightweight Python library for running TTS models with a unified API. ☆20 Updated 7 months ago
- Open TTS models, built for streaming on the edge ☆43 Updated 6 months ago
- A simple system for two-way interruptible voice interactions between a human and an LLM ☆30 Updated last year
- Orpheus Server with streaming support (TTFB ~160ms) ☆13 Updated this week
- Provides custom Gradio components that make diarization-based audio labeling easier and faster. ☆68 Updated 2 weeks ago
- ☆250 Updated 3 weeks ago
- Collection of Open Source Speech Data ☆160 Updated last week
- Turkish Speech Recognition using Facebook's wav2vec 2.0 models ☆30 Updated 3 years ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆284 Updated 3 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated 9 months ago
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life. ☆50 Updated 3 months ago
- ☆127 Updated 6 months ago
- Voxtral: Convert Mistral into an end2end SpeechLM. No information bottleneck, preserves prosody, learns interruptions from data. Unlike GP… ☆33 Updated 6 months ago
- Open-source reproducible benchmarks from Argmax ☆58 Updated this week
- ☆158 Updated 2 years ago
- ☆206 Updated last year
- A lightweight, efficient variation of the StyleTTS 2 text-to-speech model. ☆40 Updated 4 months ago
- Inference and fine-tuning examples for vision models from 🤗 Transformers ☆161 Updated last month
- Add real-time Speech-to-Text to your LiveKit application with AssemblyAI ☆17 Updated 3 months ago
- An example repository for using the Hugging Face smolagents, Phidata, and CrewAI frameworks with local LLMs ☆40 Updated 8 months ago
- Speech synthesis (TTS) in low-resource languages by training from scratch with FastPitch and fine-tuning with HiFi-GAN ☆58 Updated last year
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆24 Updated 6 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆123 Updated last month
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆57 Updated 4 months ago
- ☆62 Updated last year
- A WebRTC server that lets you interact with an LLM by speech and responds with generated audio. ☆136 Updated last year
- An open-source chatbot architecture for voice/vision (and multimodal) assistants that run locally (CPU/GPU bound) or remotely (I/O bound). ☆74 Updated this week
- Self-contained Python library with zero dependencies that gives you unified device properties for GPU, CPU, and NPU. No more calling separat… ☆13 Updated 2 weeks ago