🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning
☆161 · Jul 15, 2024 · Updated last year
Alternatives and similar repositories for StyleTTS2
Users who are interested in StyleTTS2 compare it to the libraries listed below.
- ☆16 · Apr 23, 2024 · Updated last year
- Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation ☆24 · Jun 24, 2024 · Updated last year
- ☆19 · Jul 11, 2024 · Updated last year
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing. ☆258 · Jun 10, 2024 · Updated last year
- Fine Tune the Style-TTS2 Voice Model ☆269 · Jun 17, 2025 · Updated 8 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,187 · Aug 10, 2024 · Updated last year
- Dungeon procedural generator similar to whatabou's "One Page Dungeon" ☆50 · Jan 4, 2026 · Updated last month
- ☆100 · Apr 27, 2024 · Updated last year
- NOVA-3D: Non-overlapped Views for 3D Anime Character Reconstruction ☆26 · Mar 14, 2024 · Updated last year
- ☆20 · Jun 26, 2024 · Updated last year
- Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support. ☆16 · Feb 4, 2024 · Updated 2 years ago
- ComfyUI style LDM patching in A1111 ☆53 · Jun 11, 2024 · Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io ☆16 · Apr 18, 2024 · Updated last year
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for … ☆13 · Oct 4, 2024 · Updated last year
- Create Unmute voice embeddings ☆24 · Nov 15, 2025 · Updated 3 months ago
- ☆12 · Mar 18, 2024 · Updated last year
- Controllable and fast Text-to-Speech for over 7000 languages! ☆2,188 · Jan 25, 2026 · Updated last month
- This is the official repository for "LatentMan: Generating Consistent Animated Characters using Image Diffusion Models" [CVPRW 2024] ☆22 · Jul 21, 2024 · Updated last year
- ☆24 · May 22, 2024 · Updated last year
- STDFormer: Spatio Temporal Disentanglement Learning for 3D Human Mesh Recovery from Monocular Videos with Transformer ☆45 · Mar 14, 2024 · Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆20 · May 20, 2025 · Updated 9 months ago
- Application of MB-iSTFT-VITS components to vits2_pytorch ☆133 · Dec 29, 2025 · Updated 2 months ago
- ☆33 · Aug 9, 2024 · Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆268 · Jan 13, 2025 · Updated last year
- animatediff prompt travel ☆19 · Jan 27, 2024 · Updated 2 years ago
- We introduce OpenStory++, a large-scale open-domain dataset focusing on enabling MLLMs to perform storytelling generation tasks. ☆16 · Aug 30, 2024 · Updated last year
- ☆14 · Oct 16, 2023 · Updated 2 years ago
- Program that enables seamless interaction with your documents through an advanced vector database and the power of Large Language Model (… ☆18 · Sep 12, 2023 · Updated 2 years ago
- Official repository for the paper "VQDM: Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization" ☆34 · Sep 17, 2024 · Updated last year
- An Open Source text-to-speech system built by inverting Whisper. ☆4,567 · Dec 14, 2025 · Updated 2 months ago
- Official Implementation of StyleTTS ☆462 · Jan 13, 2025 · Updated last year
- A ggml (C++) re-implementation of tortoise-tts ☆193 · Aug 20, 2024 · Updated last year
- Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (ICCV, 2025) ☆52 · Jan 14, 2026 · Updated last month
- An unofficial PyTorch implementation of VALL-E ☆88 · Aug 3, 2025 · Updated 7 months ago
- ☆22 · Aug 31, 2024 · Updated last year
- A single Gradio + React WebUI with extensions for ACE-Step, Kimi Audio, Piper TTS, GPT-SoVITS, CosyVoice, XTTSv2, DIA, Kokoro, OpenVoice,… ☆2,989 · Feb 19, 2026 · Updated last week
- Blender addon for 4D Humans/WHAM/SLAHMR Projects ☆40 · Jun 17, 2024 · Updated last year
- ☆16 · Apr 7, 2024 · Updated last year
- OminiControl for the GPU Poor ☆39 · Jan 27, 2025 · Updated last year