Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets
☆141 · Aug 10, 2025 · Updated 10 months ago
Alternatives and similar repositories for TTSizer
Users that are interested in TTSizer are comparing it to the libraries listed below.
- Try to replicate the architecture of MiniMaxTTS mentioned in its technical report ☆47 · Sep 2, 2025 · Updated 9 months ago
- ☆101 · Jan 19, 2026 · Updated 5 months ago
- ☆41 · Jul 15, 2025 · Updated 11 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆27 · Mar 21, 2025 · Updated last year
- poorman's ar-dit tts ☆45 · Dec 31, 2025 · Updated 6 months ago
- High quality text-to-speech based on StyleTTS 2. ☆78 · Apr 6, 2026 · Updated 2 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- ☆25 · Mar 6, 2024 · Updated 2 years ago
- ☆25 · Feb 14, 2026 · Updated 4 months ago
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ☆162 · Mar 26, 2026 · Updated 3 months ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",… ☆80 · May 29, 2023 · Updated 3 years ago
- FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice. ☆250 · Feb 25, 2026 · Updated 4 months ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme… ☆45 · Sep 5, 2025 · Updated 9 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆217 · Sep 19, 2024 · Updated last year
- ☆15 · Nov 11, 2024 · Updated last year
- GPT-style network for phonemization with durations of text ☆69 · Mar 21, 2024 · Updated 2 years ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ☆374 · Sep 3, 2024 · Updated last year
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in … ☆57 · Aug 9, 2025 · Updated 10 months ago
- The demo page for ALMTokenizer ☆59 · Apr 14, 2025 · Updated last year
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs ☆57 · Nov 19, 2025 · Updated 7 months ago
- ☆19 · Mar 22, 2024 · Updated 2 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆25 · Jul 5, 2022 · Updated 3 years ago
- ☆26 · Sep 22, 2022 · Updated 3 years ago
- All generative model in one for better TTS model ☆74 · Sep 8, 2024 · Updated last year
- Train the next generation of TTS systems. ☆169 · Sep 13, 2024 · Updated last year
- [NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching ☆127 · Apr 8, 2026 · Updated 2 months ago
- ☆12 · Nov 7, 2024 · Updated last year
- text to speech ☆10 · Mar 19, 2024 · Updated 2 years ago
- ☆18 · Feb 9, 2020 · Updated 6 years ago
- Parallel waveform generation with DiffusionGAN ☆17 · Mar 26, 2022 · Updated 4 years ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 7 months ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆77 · Jul 16, 2023 · Updated 2 years ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD… ☆198 · Jan 25, 2026 · Updated 5 months ago
- Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec" ☆36 · Oct 23, 2025 · Updated 8 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆100 · Nov 9, 2024 · Updated last year
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆23 · May 19, 2026 · Updated last month
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated 2 years ago
- ☆36 · Sep 6, 2025 · Updated 9 months ago