supertone-inc/super-monotonic-align

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/supertone-inc/super-monotonic-align)

supertone-inc / super-monotonic-align

☆173

Alternatives and similar repositories for super-monotonic-align

Users that are interested in super-monotonic-align are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

maum-ai / phaseaug
View on GitHub
ICASSP 2023 Accepted
☆191May 6, 2024Updated 2 years ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
keonlee9420 / evaluate-zero-shot-tts
View on GitHub
Evaluation Protocol for Large-Scale Zero-Shot TTS Literature
☆97Mar 12, 2025Updated last year
line / LibriTTS-P
View on GitHub
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161Jun 13, 2024Updated 2 years ago
zhenye234 / FlashSpeech
View on GitHub
ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis
☆156Sep 20, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
hs-oh-prml / DiffProsody
View on GitHub
☆69Jul 29, 2023Updated 3 years ago
KdaiP / StableTTS
View on GitHub
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆437Sep 13, 2024Updated last year
junjun3518 / alias-free-torch
View on GitHub
Simple torch.nn.module implementation of Alias-Free-GAN style filter and resample
☆101Jul 26, 2022Updated 4 years ago
winddori2002 / DEX-TTS
View on GitHub
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
☆108Jan 17, 2025Updated last year
jhcodec843 / jhcodec
View on GitHub
☆48Updated this week
p0p4k / pflowtts_pytorch
View on GitHub
Unofficial implementation of NVIDIA P-Flow TTS paper
☆228Dec 24, 2024Updated last year
maum-ai / maum-ai.github.io
View on GitHub
maum-ai.github.io
☆15Jun 12, 2026Updated last month
anonymous-pits / pits
View on GitHub
PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
☆280Jul 16, 2023Updated 3 years ago
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
BakerBunker / FreeV
View on GitHub
[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
☆98Jul 4, 2024Updated 2 years ago
sh-lee-prml / BigVGAN
View on GitHub
Unofficial pytorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training
☆136Feb 18, 2023Updated 3 years ago
ncsoft / avocodo
View on GitHub
Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)
☆154Feb 1, 2023Updated 3 years ago
tarepan / SpeechMOS
View on GitHub
Easy-to-Use Speech MOS predictors
☆362Oct 24, 2023Updated 2 years ago
lifeiteng / naturalspeech3_facodec
View on GitHub
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆254Apr 20, 2024Updated 2 years ago
X-LANCE / VoiceFlow-TTS
View on GitHub
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆376Sep 3, 2024Updated last year
sarulab-speech / UTMOSv2
View on GitHub
UTokyo-SaruLab MOS Prediction System
☆357Apr 2, 2026Updated 3 months ago
AI-S2-Lab / FluentEditor
View on GitHub
[InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62Oct 23, 2024Updated last year
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
naver-ai / usdm
View on GitHub
Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)
☆95Dec 3, 2024Updated last year
WangHelin1997 / SSR-Speech
View on GitHub
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
☆154Jan 1, 2025Updated last year
liutaocode / TTS-arxiv-daily
View on GitHub
Automatically Update Text-to-speech (TTS) Papers Daily using Github Actions (Update Every 12th hours)
☆663Updated this week
ZhikangNiu / A-DMA
View on GitHub
[INTERSPEECH 2025 Oral]Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67Jun 16, 2025Updated last year
hayeong0 / DDDM-VC
View on GitHub
Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V…
☆244Jul 31, 2024Updated last year
zhai-lw / SQCodec
View on GitHub
A lightweight audio codec based on a single quantizer
☆72Aug 15, 2025Updated 11 months ago
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
View on GitHub
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60Apr 4, 2024Updated 2 years ago
p0p4k / Matcha-TTS-2
View on GitHub
E2E TTS using Conditional Flow Matching (Experimental*)
☆71Nov 10, 2023Updated 2 years ago
sh-lee-prml / PeriodWave
View on GitHub
The official Implementation of PeriodWave and PeriodWave-Turbo
☆226Apr 14, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
facebookresearch / lst
View on GitHub
Code for Latent Speech-Text Transformer (LST)
☆35Mar 12, 2026Updated 4 months ago
k2-fsa / libriheavy
View on GitHub
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
☆220Sep 10, 2024Updated last year
yl4579 / HiFTNet
View on GitHub
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
☆258Jan 14, 2025Updated last year
maum-ai / univnet
View on GitHub
Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)
☆286Oct 8, 2021Updated 4 years ago
dhchoi99 / NANSY
View on GitHub
☆171Jul 25, 2022Updated 4 years ago
NVIDIA / radtts
View on GitHub
Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, …
☆291Apr 6, 2023Updated 3 years ago
glory20h / VoiceLDM
View on GitHub
VoiceLDM: Text-to-Speech with Environmental Context
☆194Aug 9, 2024Updated last year