diegotg2000/PitchFlower

Official implementation of the paper PitchFlower: A flow-based neural audio codec with pitch controllability

☆36

Alternatives and similar repositories for PitchFlower

Users who are interested in PitchFlower are comparing it to the repositories listed below.

wonjune-kang / expressive-speech-retrieval
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15 · Aug 18, 2025 · Updated 11 months ago
dmlguq456 / TF_Restormer
Official repository of TF-Restormer for speech restoration
☆15 · May 14, 2026 · Updated 2 months ago
Zhongxu-Wang / ArtSpeech
ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations
☆22 · Sep 21, 2025 · Updated 10 months ago
JozefColdenhoff / OpenACE
☆11 · Aug 1, 2025 · Updated 11 months ago
wavtechyukky / NHVSing
Neural Homomorphic Vocoder optimized for singing voice synthesis
☆47 · Jul 10, 2026 · Updated 2 weeks ago
xjuspeech / YOLOPitch
☆10 · Jun 11, 2024 · Updated 2 years ago
AmphionTeam / TaDiCodec
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆77 · Jan 25, 2026 · Updated 6 months ago
duerig / StyleTTS2
StyleTTS 2 Optimized Training Fork
☆32 · Feb 2, 2025 · Updated last year
IDEA-Emdoor-Lab / UniTTS
A TTS Trained on Universal Audio.
☆41 · Jun 6, 2025 · Updated last year
yxlu-0102 / IDEA-TTS
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
☆27 · Mar 21, 2025 · Updated last year
primepake / learnable-speech
This repo is text to speech with learnable audio encoder without alignment with transcript reference
☆54 · Sep 20, 2025 · Updated 10 months ago
kaistmm / VoiceDiT
[ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
☆52 · Apr 9, 2025 · Updated last year
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 6 months ago
Bartelds / ctc-dro
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17 · May 16, 2025 · Updated last year
facebookresearch / lst
Code for Latent Speech-Text Transformer (LST)
☆35 · Mar 12, 2026 · Updated 4 months ago
mbzuai-nlp / sttatts
☆31 · Oct 29, 2024 · Updated last year
TGuichoux / Gelina
Official implementation of Gelina
☆31 · Apr 28, 2026 · Updated 3 months ago
biboamy / AVASpeech_Music_Labels
☆20 · Nov 3, 2021 · Updated 4 years ago
LilDevsy0117 / Ultra-Sortformer
Ultra-Sortformer for Scalable Speaker Diarization
☆27 · Apr 9, 2026 · Updated 3 months ago
yuriak / SpeechDialogueFactory
☆40 · Apr 3, 2025 · Updated last year
Chengyuann / AutoStyle-TTS
Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…
☆26 · Feb 1, 2026 · Updated 5 months ago
daewoung / ViolinDiff
[ICASSP 2025] Official implementation of "ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning".
☆18 · Feb 2, 2025 · Updated last year
adobe-research / openflam
OpenFLAM: Framewise Language Audio Model
☆110 · Jun 4, 2026 · Updated last month
Twinkzzzzz / MeanSE
Official implementation of 'MeanSE: Efficient Generative Speech Enhancement with Mean Flows'
☆20 · Oct 11, 2025 · Updated 9 months ago
ydqmkkx / ShallowFlowMatching-TTS
Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
☆55 · Sep 20, 2025 · Updated 10 months ago
lucasnewman / vocos-mlx
Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX
☆24 · Oct 30, 2024 · Updated last year
yukara-ikemiya / Open-Miipher-2
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆70 · Sep 22, 2025 · Updated 10 months ago
sarulab-speech / Sidon
Training code and dataset cleansing with Sidon
☆169 · Apr 24, 2026 · Updated 3 months ago
neuphonic / neucodec
A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec.
☆161 · Jun 22, 2026 · Updated last month
aiola-lab / drax
Drax: Speech Recognition with Discrete Flow Matching
☆75 · Oct 15, 2025 · Updated 9 months ago
zhu-han / SpeechLLM
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35 · Sep 25, 2025 · Updated 10 months ago
SonyCSLParis / codicodec
Encode and decode audio samples to/from continuous and discrete compressed representations!
☆121 · Nov 25, 2025 · Updated 8 months ago
jiaqili3 / DualCodec
[Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec
☆72 · Mar 11, 2026 · Updated 4 months ago
roudimit / Omni-R1
[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
☆47 · Nov 21, 2025 · Updated 8 months ago
ryuclc / CosyVoice2-GRPO
A simple implementation for improving CosyVoice2 by GRPO method
☆39 · May 5, 2026 · Updated 2 months ago
kyutai-labs / tts_longeval
☆30 · Apr 29, 2026 · Updated 3 months ago
yucongzh / online_speaker_diarization
☆15 · Jul 11, 2022 · Updated 4 years ago
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
lars76 / swift-f0
Fast and accurate fundamental frequency (F0) detector using convolutional neural networks
☆174 · Sep 2, 2025 · Updated 10 months ago