playht/PlayDiffusion

playht / PlayDiffusion

☆538

Alternatives and similar repositories for PlayDiffusion

Users interested in PlayDiffusion are comparing it to the libraries listed below.

fluxions-ai / vui
Real-time voice assistant — WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice c…
☆727 · Jul 9, 2026 · Updated 2 weeks ago
yl4579 / DMOSpeech2
☆302 · Jul 22, 2025 · Updated last year
canopyai / Orpheus-TTS
Towards Human-Sounding Speech
☆6,258 · Dec 5, 2025 · Updated 7 months ago
inclusionAI / Ming-UniAudio
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
☆450 · Nov 27, 2025 · Updated 7 months ago
stepfun-ai / Step-Audio-EditX
A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model excels at editing emotion, speaking style, and paralinguistics…
☆954 · Apr 9, 2026 · Updated 3 months ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
ace-step / ACE-Step
ACE-Step: A Step Towards Music Generation Foundation Model
☆4,685 · Feb 15, 2026 · Updated 5 months ago
Tencent-Hunyuan / HunyuanVideo-Avatar
☆2,138 · Dec 16, 2025 · Updated 7 months ago
edwko / OuteTTS
Interface for OuteTTS models.
☆1,436 · Mar 23, 2026 · Updated 4 months ago
QwenAudio / CV3-Eval
☆187 · Aug 25, 2025 · Updated 11 months ago
QwenAudio / ThinkSound
[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Tho…
☆1,372 · Apr 3, 2026 · Updated 3 months ago
Marvis-Labs / marvis-tts
☆365 · Aug 28, 2025 · Updated 10 months ago
MYZY-AI / Muyan-TTS
☆480 · May 19, 2025 · Updated last year
kyutai-labs / delayed-streams-modeling
Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
☆2,982 · Jan 26, 2026 · Updated 5 months ago
meituan-longcat / LongCat-Audio-Codec
LongCat Audio Tokenizer and Detokenizer
☆301 · May 9, 2026 · Updated 2 months ago
ZeyueT / AudioX
[ICLR 2026] Repository of AudioX
☆1,544 · Mar 10, 2026 · Updated 4 months ago
e-c-k-e-r / vall-e
An unofficial PyTorch implementation of VALL-E
☆88 · Aug 3, 2025 · Updated 11 months ago
boson-ai / higgs-audio
Text-audio foundation model from Boson AI
☆8,298 · Jun 5, 2026 · Updated last month
X-LANCE / VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆376 · Sep 3, 2024 · Updated last year
facebookresearch / audiobox-aesthetics
Unified automatic quality assessment for speech, music, and sound.
☆745 · Jun 5, 2025 · Updated last year
k2-fsa / ZipVoice
Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching
☆1,018 · Dec 2, 2025 · Updated 7 months ago
naver-ai / RapFlow-TTS
☆56 · Jul 16, 2025 · Updated last year
yl4579 / HiFTNet
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
☆257 · Jan 14, 2025 · Updated last year
Yaofang-Liu / Pusa-VidGen
Pusa: Thousands Timesteps Video Diffusion Model
☆686 · Feb 13, 2026 · Updated 5 months ago
ictnlp / SLED-TTS
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108 · May 20, 2025 · Updated last year
MoonshotAI / Kimi-Audio
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
☆4,690 · Jun 21, 2025 · Updated last year
FotographerAI / ZenCtrl
In-context subject-driven image generation while preserving foreground fidelity
☆351 · Jun 11, 2025 · Updated last year
WangHelin1997 / SSR-Speech
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
☆154 · Jan 1, 2025 · Updated last year
maitrix-org / Voila
☆496 · May 6, 2025 · Updated last year
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
XiaomiMiMo / MiMo-Audio
MiMo-Audio: Audio Language Models are Few-Shot Learners
☆1,066 · Jun 17, 2026 · Updated last month
vivoCameraResearch / Magic-TryOn
MagicTryOn is a video virtual try-on framework based on a large-scale video diffusion Transformer.
☆562 · Apr 30, 2026 · Updated 2 months ago
Tencent-Hunyuan / HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
☆1,226 · Oct 15, 2025 · Updated 9 months ago
shang0712 / HierTTS
☆47 · Apr 16, 2023 · Updated 3 years ago
magenta / magenta-realtime
Magenta RealTime 2: An Open-Weights Live Music Model
☆1,693 · Updated this week
QwenAudio / FunMusic
A fundamental toolkit designed for music, song, and audio generation
☆1,371 · May 20, 2025 · Updated last year
juhayna-zh / AudioControlNet
Official repository for the paper "Audio ControlNet for Fine-Grained Audio Generation and Editing".
☆77 · Feb 7, 2026 · Updated 5 months ago
Omni-Avatar / OmniAvatar
☆1,847 · Aug 6, 2025 · Updated 11 months ago
k2-fsa / Flow2GAN
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆145 · Mar 8, 2026 · Updated 4 months ago