ttsds / ttsdb
A database for modern, open-source TTS systems.
☆30 Updated last week
Alternatives and similar repositories for ttsdb
Users who are interested in ttsdb compare it to the libraries listed below.
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec. ☆150 Updated 2 weeks ago
- A highly optimized engine for neutts-air model to generate minutes of audio in seconds. Over 200x realtime on modern hardware! ☆110 Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆70 Updated 3 months ago
- Very fast, accurate speaker diarization ☆228 Updated this week
- On-device streaming text-to-speech engine powered by deep learning ☆128 Updated 2 weeks ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 Updated last year
- Create an LJSpeech structured voice dataset on wave input ☆37 Updated last year
- ☆370 Updated 4 months ago
- A simple, hackable text-to-speech system in PyTorch and MLX ☆187 Updated 6 months ago
- Streaming and Fine-tuning for Chatterbox TTS ☆267 Updated 7 months ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆135 Updated 6 months ago
- Open TTS models, built for streaming on the edge ☆45 Updated 10 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆219 Updated 9 months ago
- Fast audio super resolution from 16 kHz to 48 kHz. ☆192 Updated last month
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆161 Updated last year
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆295 Updated 8 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆21 Updated last year
- DACVAE ☆191 Updated last month
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆195 Updated 9 months ago
- Demo Python script app to interact with a llama.cpp server using the whisper API, microphone and webcam devices. ☆46 Updated 2 years ago
- Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative model… ☆68 Updated last week
- A highly compressive and high-quality neural audio codec for speech models. ☆250 Updated 2 weeks ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆307 Updated 8 months ago
- A lightweight Python library for running TTS models with a unified API. ☆21 Updated 11 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆228 Updated 8 months ago
- ☆56 Updated 3 weeks ago
- A random walk voice style cloning application for Kokoro text to speech ☆205 Updated 7 months ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆53 Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated last year
- ☆275 Updated last year