frothywater/kanade-tokenizer

Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative modeling.

☆108

Alternatives and similar repositories for kanade-tokenizer

Users who are interested in kanade-tokenizer are comparing it to the libraries listed below.

Aratako / MioCodec
☆27 · Feb 14, 2026 · Updated 5 months ago
ysharma3501 / LinaCodec
A highly compressive and high-quality neural audio codec for speech models.
☆269 · Jan 23, 2026 · Updated 6 months ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
sizigi / AliasingFreeNeuralAudioSynthesis
☆64 · Dec 24, 2025 · Updated 7 months ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
AmphionTeam / TaDiCodec
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆77 · Jan 25, 2026 · Updated 5 months ago
lucadellalib / focalcodec
A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation
☆173 · Nov 30, 2025 · Updated 7 months ago
Stylish-TTS / stylish-tts
High quality text-to-speech based on StyleTTS 2.
☆78 · Apr 6, 2026 · Updated 3 months ago
AmphionTeam / FlexiCodec
[ICLR2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
☆50 · Jul 1, 2026 · Updated 3 weeks ago
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆45 · Sep 5, 2025 · Updated 10 months ago
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 5 months ago
llm-jp / llama-mimi
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…
☆31 · Sep 20, 2025 · Updated 10 months ago
ASLP-lab / FlashTTS
Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation
☆67 · Jun 16, 2026 · Updated last month
smallbraineng / smalltts
superfast text to speech in any voice
☆62 · Feb 16, 2026 · Updated 5 months ago
RayYuki / CodecBench
☆24 · Nov 16, 2025 · Updated 8 months ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year
zhai-lw / SQCodec
A lightweight audio codec based on a single quantizer
☆72 · Aug 15, 2025 · Updated 11 months ago
lucadellalib / dycast
A variable-frame-rate 16 kHz speech codec based on FocalCodec
☆20 · Feb 11, 2026 · Updated 5 months ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120 · Jun 21, 2026 · Updated last month
Lab-MSP / NaturalVoices
☆33 · Oct 28, 2025 · Updated 8 months ago
Wataru-Nakata / latentlm-tts
☆29 · Jul 3, 2026 · Updated 3 weeks ago
Aratako / T5Gemma-TTS
Multilingual TTS model with voice cloning and duration control, based on T5Gemma encoder-decoder LLM
☆311 · Apr 3, 2026 · Updated 3 months ago
ZhangXinWhut / SimWhisper-Codec
Official code for paper: "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"
☆37 · Jan 28, 2026 · Updated 5 months ago
YangXusheng-yxs / CodecFormer_5Hz
☆35 · Oct 23, 2025 · Updated 9 months ago
haoheliu / SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
☆255 · Mar 7, 2025 · Updated last year
tabahi / contexless-phonemes-CUPE
pytorch model for contexless-phoneme prediction from speech audio
☆32 · Oct 30, 2025 · Updated 8 months ago
line / WaveTrainerFit
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16 · Feb 6, 2026 · Updated 5 months ago
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125 · Jun 4, 2025 · Updated last year
nonverbalspeech38k / nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…
☆68 · Dec 26, 2025 · Updated 6 months ago
unilight / jatts
JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit
☆43 · Mar 13, 2026 · Updated 4 months ago
meaningTeam / tidy-tunes
Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …
☆23 · May 19, 2026 · Updated 2 months ago
exercise-book-yq / Supercodec
☆51 · Mar 5, 2026 · Updated 4 months ago
ttsds / ttsds
The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these fac…
☆97 · Jul 7, 2026 · Updated 2 weeks ago
gyt1145028706 / XY-Tokenizer
This is the code for paper: XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs
☆97 · Sep 19, 2025 · Updated 10 months ago
bigai-nlco / UltraVoice
Official Repository of UltraVoice
☆62 · Oct 28, 2025 · Updated 8 months ago
Vyvo-Labs / CodecHub
CodecHub: A Unified Library for Codec Models
☆25 · Dec 24, 2025 · Updated 7 months ago
yl4579 / DMOSpeech2
☆302 · Jul 22, 2025 · Updated last year
herimor / voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
☆245 · May 30, 2026 · Updated last month