ttsds / ttsds_systems
Recipes to create the synthetic data for the benchmarked TTS systems.
☆25 Updated 5 months ago
Alternatives and similar repositories for ttsds_systems
Users who are interested in ttsds_systems compare it to the repositories listed below.
- ☆14 Updated 2 months ago
- ☆62 Updated 9 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆54 Updated last month
- ☆214 Updated last month
- Misc. tools/scripts that I made to use for tortoise ☆21 Updated 8 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated last month
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis ☆265 Updated 2 months ago
- Awesome music generation model: MG² ☆154 Updated last month
- A TTS model capable of generating ultra-realistic dialogue in one pass. ☆63 Updated last week
- VoiceLDM: Text-to-Speech with Environmental Context ☆175 Updated 9 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 Updated last year
- Open TTS models, built for streaming on the edge ☆41 Updated 2 months ago
- ☆20 Updated last month
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM ☆246 Updated last month
- ☆40 Updated 3 months ago
- A simple, hackable text-to-speech system in PyTorch and MLX ☆159 Updated 2 months ago
- A lightweight Python library for running TTS models with a unified API. ☆18 Updated 2 months ago
- A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding. ☆366 Updated 3 weeks ago
- VoiceBox neural network implementation ☆107 Updated 9 months ago
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ☆182 Updated 4 months ago
- The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu… ☆84 Updated 8 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆178 Updated 7 months ago
- ☆256 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 7 months ago
- ☆109 Updated 3 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆166 Updated 3 weeks ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆71 Updated 7 months ago
- SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on On… ☆207 Updated this week
- The demo page of UniAudio ☆33 Updated last year
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆52 Updated 5 months ago