nene1212/MaskGCT-Training

Training code for MaskGCT-T2S model.

☆25

Alternatives and similar repositories for MaskGCT-Training

Users that are interested in MaskGCT-Training are comparing it to the libraries listed below.

rishikksh20 / voxtral-codec-pytoch
Voxtral Codec: Combining Semantic VQ and Acoustic FSQ for Ultra-Low Bitrate Speech Generation (Voxtral TTS Backbone)
☆15 · Mar 27, 2026 · Updated 4 months ago
exercise-book-yq / FreeCodec
FREECODEC: A DISENTANGLED NEURAL SPEECH CODEC WITH FEWER TOKENS
☆24 · Sep 9, 2024 · Updated last year
bfs18 / e2_tts
☆69 · Sep 3, 2024 · Updated last year
kimsunwiub / BLOOM-Net
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14 · Feb 13, 2022 · Updated 4 years ago
SparkAudio / VoxBox
A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.
☆115 · May 5, 2025 · Updated last year
haoheliu / torchsubband
PyTorch implementation of subband decomposition
☆93 · Jul 26, 2022 · Updated 4 years ago
hbwu-ntu / EmoCtrlTTS-Eval
☆19 · Aug 23, 2024 · Updated last year
Edresson / ZS-TTS-Evaluation
☆45 · Sep 19, 2024 · Updated last year
ga642381 / AudioCodec-Hub
AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models
☆25 · Sep 26, 2023 · Updated 2 years ago
yluo42 / GC3
☆51 · May 16, 2021 · Updated 5 years ago
XXH333 / WordVoice-main
The inference and training code for WordVoice.
☆66 · Updated this week
shansongliu / HumTrans
☆13 · Sep 26, 2023 · Updated 2 years ago
omine-me / LaughterSegmentation
2024 latest laughter detection & segmentation model. Paper: "Robust Laughter Segmentation with Automatic Diverse Data Synthesis", Interspe…
☆66 · Sep 1, 2024 · Updated last year
ajd12342 / paraspeechcaps
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆165 · Mar 26, 2026 · Updated 4 months ago
yzyouzhang / SASV_PR
Official implementation of the Odyssey paper "A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification"
☆18 · Jun 24, 2022 · Updated 4 years ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 · Jul 20, 2025 · Updated last year
MTG / SingWithExpressions
This is the accompanying repository for the paper "Automatic Estimation of Singing Voice Musical Dynamics"
☆16 · Oct 28, 2024 · Updated last year
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108 · Nov 1, 2025 · Updated 8 months ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
Soul-AILab / SoulX-Singer-Eval
A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis
☆33 · Feb 11, 2026 · Updated 5 months ago
chadqiu / Shared-bicycle-lock
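An IoT shared-bicycle lock that uses an embedded STM32 microcontroller as the control chip, a SIM800C module for network communication, an Alibaba Cloud server as the communication relay and data center, PHP web pages as the communication medium, and an Android app for user interaction
☆12 · Feb 13, 2019 · Updated 7 years ago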
Shy-98 / MELLE
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41 · Jun 28, 2025 · Updated last year
jerrygood0703 / noise_adaptive_DAT_SE
Noise Adaptive Speech Enhancement using Domain Adversarial Training
☆23 · Jul 25, 2019 · Updated 7 years ago
MorenoLaQuatra / ARCH
ARCH: Audio Representations benCHmark
☆57 · Aug 26, 2024 · Updated last year
yzyouzhang / hrtf_field
Official implementation of the ICASSP 2023 paper "HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields"
☆28 · Dec 3, 2023 · Updated 2 years ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
unilight / sheet
Speech Human Evaluation Estimation Toolkit (SHEET)
☆138 · Mar 31, 2026 · Updated 3 months ago
ehabets / Signal-Generator
Generate audio signals corresponding to moving sources/receivers in a shoebox-shaped room (MATLAB)
☆39 · Jan 25, 2021 · Updated 5 years ago
mubtasimahasan / DM-Codec
Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”
☆57 · Jun 1, 2025 · Updated last year
ryota-komatsu / speech_resynth
Speech Resynthesis and Language Modeling
☆27 · Jun 11, 2025 · Updated last year
yangdongchao / RSTnet
Real-time Speech-Text Foundation Model Toolkit (WIP)
☆255 · Mar 26, 2025 · Updated last year
Andong-Li-speech / Neural-Vocoders-as-Speech-Enhancers
☆52 · Sep 10, 2024 · Updated last year
Choddeok / EmoSphere-TTS
[INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for …
☆182 · Jul 16, 2026 · Updated last week
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
AmphionTeam / Emilia-NV
Official Repository of Paper: "Emilia-NV: A Non-Verbal Speech Dataset with Word-Level Annotation for Human-Like Speech Modeling"
☆92 · Sep 18, 2025 · Updated 10 months ago
jeremychee4 / AffectSpeech
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
☆68 · Jun 12, 2026 · Updated last month
facebookresearch / lst
Code for Latent Speech-Text Transformer (LST)
☆35 · Mar 12, 2026 · Updated 4 months ago
furkanarius / Multichannel-Speech-Enhancement-with-Deep-Neural-Networks
This thesis applies an autoencoder deep neural network to the multichannel speech enhancement problem. It takes the problem from dataset …
☆14 · Sep 1, 2022 · Updated 3 years ago
BayLing-Models / BayLing-Duplex
Native full-duplex speech dialogue inference for BayLing-Duplex.
☆63 · Jun 22, 2026 · Updated last month