ZhangXinWhut/SimWhisper-Codec

Official code for the paper "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"

☆37

Alternatives and similar repositories for SimWhisper-Codec

Users that are interested in SimWhisper-Codec are comparing it to the libraries listed below.

YangXusheng-yxs / CodecFormer_5Hz
☆35 • Oct 23, 2025 • Updated 8 months ago
Soul-AILab / SAC
[ACL 2026 Main] Training, inference, and testing of the SAC speech codec model.
☆108 • Nov 1, 2025 • Updated 8 months ago
RayYuki / CodecBench
☆24 • Nov 16, 2025 • Updated 8 months ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 • Apr 10, 2025 • Updated last year
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 • Apr 14, 2025 • Updated last year
gyt1145028706 / XY-Tokenizer
This is the code for the paper "XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs"
☆96 • Sep 19, 2025 • Updated 10 months ago
ASLP-lab / M7-TTS
M7-TTS: A Mini-Scale Multilingual and Multi-Dialect Text-to-Speech Language Model with Mimi codec and Multi Token Prediction
☆20 • Mar 19, 2026 • Updated 4 months ago
lonzi / mrflow_dpo
☆22 • Jan 3, 2026 • Updated 6 months ago
AmphionTeam / TaDiCodec
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆77 • Jan 25, 2026 • Updated 5 months ago
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 • Nov 19, 2025 • Updated 8 months ago
aask1357 / hilcodec
High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec
☆120 • Jun 23, 2025 • Updated last year
Ereboas / MagiCodec
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆124 • Jun 4, 2025 • Updated last year
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120 • Jun 21, 2026 • Updated 3 weeks ago
fluxions-ai / stftvae
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43 • Jul 12, 2026 • Updated last week
SWivid / AUV
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28 • Oct 11, 2025 • Updated 9 months ago
y-ren16 / OV-InstructTTS
☆22 • Jan 27, 2026 • Updated 5 months ago
bovod-sjtu / HoliTok
HoliTok: A Continuous Holistic Tokenization with Robust Dual Capabilities of Speech Generation and Understanding
☆36 • Jun 8, 2026 • Updated last month
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 • Nov 7, 2025 • Updated 8 months ago
k2-fsa / Flow2GAN
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆144 • Mar 8, 2026 • Updated 4 months ago
seongho608 / RingFormer
☆52 • Jun 24, 2025 • Updated last year
xiaomi-research / dasheng-tokenizer
State-of-the-art continuous audio tokenization
☆40 • Mar 9, 2026 • Updated 4 months ago
yanghaha0908 / WavCube
Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"
☆62 • Jun 27, 2026 • Updated 3 weeks ago
zeyuxie29 / SemanticVocoder
☆28 • Apr 6, 2026 • Updated 3 months ago
jiaqili3 / DualCodec
[Interspeech 2025] DualCodec: A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec
☆72 • Mar 11, 2026 • Updated 4 months ago
zhai-lw / L3AC
A lightweight audio codec based on a single quantizer
☆35 • Sep 4, 2025 • Updated 10 months ago
tabahi / contexless-phonemes-CUPE
PyTorch model for contextless-phoneme prediction from speech audio
☆32 • Oct 30, 2025 • Updated 8 months ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 • May 1, 2025 • Updated last year
bfs18 / armel
poorman's ar-dit tts
☆45 • Dec 31, 2025 • Updated 6 months ago
exercise-book-yq / Supercodec
☆51 • Mar 5, 2026 • Updated 4 months ago
xiquan-li / MeanAudio
[ACL 2026 Main] MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows
☆142 • Sep 2, 2025 • Updated 10 months ago
hs-oh-prml / DurFlexEVC
☆81 • Jan 22, 2025 • Updated last year
zhai-lw / SQCodec
A lightweight audio codec based on a single quantizer
☆72 • Aug 15, 2025 • Updated 11 months ago
youngsheen / GPST
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆70 • Nov 1, 2024 • Updated last year
malradhi / PACodec
[ICASSP 2026] Official code for "Prosody-Guided Harmonic Attention for Phase-Coherent Neural Vocoding in the Complex Spectrum"
☆27 • Jan 22, 2026 • Updated 5 months ago
ictnlp / SLED-TTS
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108 • May 20, 2025 • Updated last year
AI-S2-Lab / GPT-Talker
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆45 • Oct 28, 2024 • Updated last year
AmphionTeam / FlexiCodec
[ICLR 2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
☆50 • Jul 1, 2026 • Updated 2 weeks ago
disco-speech / DisCo-Speech
☆90 • Dec 31, 2025 • Updated 6 months ago
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 • Jan 25, 2026 • Updated 5 months ago