StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆37 · May 17, 2025 · Updated 11 months ago
Alternatives and similar repositories for StyleTTS2
Users that are interested in StyleTTS2 are comparing it to the libraries listed below.
- ☆74 · Mar 21, 2025 · Updated last year
- This project contains an NVIDIA AI Workbench project for easy installation. ☆12 · May 30, 2024 · Updated last year
- Misc. tools/scripts that I made to use for tortoise ☆21 · Aug 19, 2024 · Updated last year
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆54 · Apr 13, 2026 · Updated 3 weeks ago
- A simple module for making a request to the tortoise gradio page. ☆17 · Jun 10, 2024 · Updated last year
- ☆101 · Aug 14, 2024 · Updated last year
- ☆528 · Feb 21, 2026 · Updated 2 months ago
- Bellingcat Hackathon - Digital Investigation Tool 2022 ☆13 · Sep 25, 2022 · Updated 3 years ago
- Silero TTS web UI ☆14 · Jan 30, 2024 · Updated 2 years ago
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl… ☆14 · Feb 7, 2025 · Updated last year
- Drax: Speech Recognition with Discrete Flow Matching ☆75 · Oct 15, 2025 · Updated 6 months ago
- Lightweight offline Linux command tutor using a local LLM and ChromaDB. ☆13 · Apr 27, 2025 · Updated last year
- raster2laser_gcode ☆14 · Dec 29, 2022 · Updated 3 years ago
- Simple Dash Effect Unity ☆14 · Mar 18, 2021 · Updated 5 years ago
- Performs the entire AI cover generation process with UI ☆29 · Aug 4, 2025 · Updated 9 months ago
- This is the Github Pages repository for TrashBot! ☆10 · Mar 8, 2021 · Updated 5 years ago
- ☆15 · Jun 27, 2025 · Updated 10 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆188 · Sep 27, 2024 · Updated last year
- process video frame by frame inside "Extras" tab ☆20 · Sep 22, 2024 · Updated last year
- An unofficial ComfyUI custom node integration for High-quality Text-to-Speech and Voice Conversion nodes for ComfyUI using ResembleAI's C… ☆23 · Jun 4, 2025 · Updated 11 months ago
- ☆139 · Mar 11, 2025 · Updated last year
- Voice conversion tool project for Vietnamese speakers ☆26 · Mar 21, 2026 · Updated last month
- Makes camera aiming in GTA SA similar to GTA 4 ☆13 · Sep 22, 2020 · Updated 5 years ago
- Authenticate against OAuth2 Provider in Python CLIs ☆19 · Apr 17, 2026 · Updated 2 weeks ago
- [DEPRECATED! NO LONGER MAINTAINED!] A free multimedia player for Windows based on libmpv and Qt. ☆11 · Nov 27, 2018 · Updated 7 years ago
- A multi-voice TTS system trained with an emphasis on quality ☆23 · Nov 6, 2023 · Updated 2 years ago
- TTS pipeline that uses RVC to enhance audio quality and cloning ☆147 · Jan 25, 2024 · Updated 2 years ago
- Automatic audiovisual translation with lip-syncing ☆10 · Dec 21, 2019 · Updated 6 years ago
- Using RVC via console or python scripts ☆148 · Oct 18, 2024 · Updated last year
- Custom ComfyUI node set for managing long-running, prompt-driven video projects. Includes VantageProject for project management and two s… ☆43 · Sep 25, 2025 · Updated 7 months ago
- Gradio UI for training video models using finetrainers ☆33 · Apr 18, 2025 · Updated last year
- [SIGGRAPH Asia'25] Enabling Reference-based Camera Control via Context without Explicit 3D Estimation ☆157 · Jan 18, 2026 · Updated 3 months ago
- ComfyUI-Bagel is now available in ComfyUI, BAGEL is an open‑source multimodal foundation model with 7B active parameters (14B total) trai… ☆29 · May 28, 2025 · Updated 11 months ago
- Official repository of Tapir Lab.'s Lip-Sync Method ☆10 · Oct 3, 2023 · Updated 2 years ago
- GarageExtender for GTA SA developed by Link2012 ☆11 · Jul 17, 2021 · Updated 4 years ago
- zerodim-ffhq-x256 model in sd-webui ☆20 · Aug 1, 2024 · Updated last year
- a simple wrapper of fooocus prompt expansion engine in stable-diffusion-webui ☆23 · Jun 9, 2024 · Updated last year
- Streaming ProPainter ☆15 · Sep 18, 2024 · Updated last year
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback… ☆10 · Sep 17, 2025 · Updated 7 months ago