Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative modeling.
☆78Feb 3, 2026Updated last month
Alternatives and similar repositories for kanade-tokenizer
Users who are interested in kanade-tokenizer are comparing it to the libraries listed below:
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 3 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 10 months ago
- PyTorch model for contextless phoneme prediction from speech audio☆32Oct 30, 2025Updated 4 months ago
- ☆15Nov 10, 2025Updated 3 months ago
- ☆28Nov 15, 2023Updated 2 years ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- A package for NeuCodec: a 50 Hz, 0.8 kbps, 24 kHz audio codec.☆153Jan 27, 2026Updated last month
- Official repo for DisCoder: High-Fidelity Music Vocoder using Neural Audio Codecs presented at ICASSP 2025☆38Feb 24, 2025Updated last year
- ☆10Dec 22, 2023Updated 2 years ago
- [INTERSPEECH 2024] Official code for VoxSim: A perceptual voice similarity dataset☆12Sep 29, 2025Updated 5 months ago
- Testing sets for semanticVAD☆20Feb 18, 2025Updated last year
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆28Sep 20, 2025Updated 5 months ago
- Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"☆33Jan 28, 2026Updated last month
- ☆13Mar 11, 2025Updated 11 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- Official repository of Wavehax vocoder☆66Dec 20, 2025Updated 2 months ago
- A TTS Trained on Universal Audio.☆41Jun 6, 2025Updated 8 months ago
- Streaming Vocos☆30Jun 10, 2025Updated 8 months ago
- A neural speech codec based on discrete WavLM representations☆24Aug 28, 2024Updated last year
- An attempt to replicate the architecture of MiniMaxTTS described in its technical report☆49Sep 2, 2025Updated 6 months ago
- ☆30Jan 22, 2026Updated last month
- Fast and differentiable hidden Markov model in C++☆19Jan 20, 2023Updated 3 years ago
- A lightweight audio codec based on a single quantizer☆69Aug 15, 2025Updated 6 months ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- [ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"☆108Jan 17, 2025Updated last year
- ☆40Jul 15, 2025Updated 7 months ago
- The demo page for ALMTokenizer☆59Apr 14, 2025Updated 10 months ago
- VITS with phoneme-level prosody modeling based on MaskGIT☆85Aug 31, 2024Updated last year
- The Multi-band Excited WaveNet☆15Feb 2, 2023Updated 3 years ago
- ☆17Oct 16, 2018Updated 7 years ago
- Crawling and creating a German language model resource☆18Aug 23, 2022Updated 3 years ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆68Nov 1, 2024Updated last year
- ☆49Apr 1, 2025Updated 11 months ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆77Dec 3, 2025Updated 2 months ago
- Self-supervised Generative LM-based Voice Conversion☆54Apr 24, 2025Updated 10 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆18Oct 20, 2024Updated last year
- ☆21Mar 7, 2025Updated 11 months ago
- ☆13Sep 12, 2024Updated last year
- A neural vocoder supporting Ring Attention, Conformer and NSF.☆24Aug 1, 2025Updated 7 months ago