maskgct / maskgct.github.io

MaskGCT demo page

☆14

Alternatives and similar repositories for maskgct.github.io

Users who are interested in maskgct.github.io compare it to the repositories listed below

xinchen-ai / Westlake-Omni
☆204Updated last year
douhaohaode / xtts_v2
☆70Updated 2 years ago
KdaiP / StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆434Updated last year
LiuZH-19 / SongGen
[ICML 2025] SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
☆299Updated 3 months ago
inclusionAI / Ming-UniAudio
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
☆429Updated 2 months ago
OpenMOSS / SpeechGPT-2.0-preview
GPT-4o-level, real-time spoken dialogue system.
☆369Updated last year
maitrix-org / Voila
☆484Updated 9 months ago
warmshao / ChatTTSPlus
Extension of ChatTTS, 3x Faster on Windows, Support Voice Cloning and Mobile Deployment
☆172Updated last year
MYZY-AI / Muyan-TTS
☆474Updated 8 months ago
huangxu1991 / GPT-SoVITS-VC
VC Without Retrain!
☆129Updated last year
thuhcsi / NeuCoSVC
☆298Updated last year
MooreThreads / MooER
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode…
☆219Updated last year
jzq2000 / MoonCast
☆343Updated 9 months ago
yrom / finetune-index-tts
IndexTTS Fine-tuning notebooks
☆132Updated 7 months ago
open-mmlab / FoleyCrafter
[IJCV] FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
☆643Updated last year
shansongliu / MuMu-LLaMA
This is the official repository for M2UGen
☆511Updated last year
bytedance / Make-An-Audio-2
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
☆188Updated last year
FireRedTeam / FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
☆895Updated 4 months ago
YoMio-Tech-Inc / GPT-SoVITS2
GPT-SoVITS2
☆229Updated last year
nethermanpro / transvip
☆167Updated last year
haidog-yaqub / EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
☆329Updated last month
WGS-note / F5_TTS_Faster
F5-TTS inference acceleration, about 4x faster!
☆122Updated last year
innnky / ar-vits
text to speech using autoregressive transformer and VITS
☆249Updated last year
FunAudioLLM / FunAudioLLM-APP
☆379Updated last year
jdh-algo / JoyHallo
JoyHallo: Digital human model for Mandarin
☆522Updated 4 months ago
byteresearchcla / RealSI
RealSI: Open Benchmark for Simultaneous Interpretation in Real-world Scenarios
☆79Updated 7 months ago
wenet-e2e / wesr
We Speech Transcript based on LLM, in 300 lines of code.
☆183Updated 7 months ago
JimmyMa99 / train-higgs-audio
Text-audio foundation model from Boson AI
☆117Updated 5 months ago
LSimon95 / megatts2
Unofficial implementation of Megatts2
☆288Updated last year
ASLP-lab / OSUM
OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.
☆475Updated 2 months ago