☆100Apr 27, 2024Updated last year
Alternatives and similar repositories for StyleTTS2
Users who are interested in StyleTTS2 are comparing it to the libraries listed below.
- Fine Tune the Style-TTS2 Voice Model☆266Jun 17, 2025Updated 8 months ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA☆18Aug 16, 2024Updated last year
- ☆54Jul 16, 2025Updated 7 months ago
- Just another FastSpeech 2 but cleaner code :)☆29Jun 28, 2024Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…☆14Dec 30, 2023Updated 2 years ago
- StyleTTS 2 Optimized Training Fork☆33Feb 2, 2025Updated last year
- Project of Singing Voice Conversion.☆16Oct 27, 2023Updated 2 years ago
- Unofficial vits2-TTS implementation in PyTorch, adapted to support 44100 Hz Japanese audio sources.☆24Sep 1, 2023Updated 2 years ago
- Small tools to enhance your AI app with little effort.☆12Jan 9, 2024Updated 2 years ago
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz☆24Jan 2, 2024Updated 2 years ago
- High quality text-to-speech based on StyleTTS 2.☆73Updated this week
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning☆161Jul 15, 2024Updated last year
- OllaDeck is a purple technology stack for Generative AI (text modality) cybersecurity. It provides a comprehensive set of tools for both …☆18Sep 21, 2024Updated last year
- Export an ONNX graph that performs ISTFT. Designed for TTS models.☆27Apr 23, 2024Updated last year
- source code of EfficientTTS 2☆20Feb 18, 2024Updated 2 years ago
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…☆22Jan 10, 2025Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models☆6,172Aug 10, 2024Updated last year
- Speech enhancement in noisy and reverberant environments using deep neural networks☆22Oct 10, 2025Updated 4 months ago
- ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations☆21Sep 21, 2025Updated 5 months ago
- ☆38Apr 15, 2024Updated last year
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- Simple LLM inference server☆20Jun 13, 2024Updated last year
- A Python-based voice assistant integrating speech-to-text (STT), text-to-speech (TTS), and powerful AI capabilities using either a local …☆13Dec 8, 2025Updated 2 months ago
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to …☆38Aug 28, 2024Updated last year
- Caption, translate, and optionally record in real time "what you hear" from speakers and microphone. Never miss part of the conversation …☆23Sep 11, 2025Updated 5 months ago
- ☆23Oct 17, 2024Updated last year
- 🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.☆258Jun 10, 2024Updated last year
- Try to replicate the architecture of MiniMaxTTS mentioned in its technical report☆49Sep 2, 2025Updated 5 months ago
- A collection of all our phonemizers for dataset construction and inference☆27Feb 21, 2025Updated last year
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX☆29Oct 15, 2024Updated last year
- Examples of using the llasa-tts models locally☆182Apr 20, 2025Updated 10 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model.☆48Sep 15, 2025Updated 5 months ago
- ☆25Jan 24, 2023Updated 3 years ago
- Official Implementation of StyleTTS☆462Jan 13, 2025Updated last year
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation☆15Aug 1, 2024Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 3 months ago
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset☆12Sep 29, 2025Updated 5 months ago