duchenzhuang/FSQ-pytorch

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/duchenzhuang/FSQ-pytorch)

duchenzhuang / FSQ-pytorch

A Pytorch Implementation of Finite Scalar Quantization

☆190

Alternatives and similar repositories for FSQ-pytorch

Users that are interested in FSQ-pytorch are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

bojone / FSQ
View on GitHub
Keras implement of Finite Scalar Quantization
☆87Oct 31, 2023Updated 2 years ago
lucidrains / vector-quantize-pytorch
View on GitHub
Vector (and Scalar) Quantization, in Pytorch
☆3,987Updated this week
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
youngsheen / SimVQ
View on GitHub
[ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
☆328Dec 29, 2024Updated last year
valeoai / Halton-MaskGIT
View on GitHub
[ICLR2025] Halton Scheduler for Masked Generative Image Transformer
☆286Oct 28, 2025Updated 8 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Aria-K-Alethia / BigCodec
View on GitHub
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆218Sep 19, 2024Updated last year
primepake / F5-TTS-meanflow-multilingual
View on GitHub
Meanflow and multilingual for F5-TTS model
☆16Aug 23, 2025Updated 11 months ago
Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
Tencent-Hunyuan / iFSQ
View on GitHub
iFSQ & LlamaGen-REPA
☆102Jan 27, 2026Updated 5 months ago
tongdaxu / VQ-VAE-from-Gaussian-VAE
View on GitHub
Official implementation of (ICML 2026) Training-Free Vector Quantization via Gaussian VAEs
☆26Jan 3, 2026Updated 6 months ago
ZhangXInFD / SpeechTokenizer
View on GitHub
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆658Jun 9, 2024Updated 2 years ago
Ereboas / MagiCodec
View on GitHub
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125Jun 4, 2025Updated last year
zh460045050 / VQGAN-LC
View on GitHub
☆145Jun 28, 2024Updated 2 years ago
zhai-lw / SQCodec
View on GitHub
A lightweight audio codec based on a single quantizer
☆72Aug 15, 2025Updated 11 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year
zhaoyue-zephyrus / npq-vit
View on GitHub
[ICLR 2025] Binary Spherical Quantization + [CVPR 2026] Leech Spherical Quantization
☆221Dec 18, 2025Updated 7 months ago
ictnlp / SLED-TTS
View on GitHub
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108May 20, 2025Updated last year
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
KdaiP / DC-Speech-VAE
View on GitHub
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57Nov 19, 2025Updated 8 months ago
zhenye234 / xcodec
View on GitHub
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
☆308Oct 12, 2025Updated 9 months ago
YuqingWang1029 / TokenBridge
View on GitHub
[ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To…
☆158Jul 24, 2025Updated last year
IDEA-Emdoor-Lab / DistilCodec
View on GitHub
A Neural Audio Codec (NAC) for Universal Audio
☆47May 30, 2025Updated last year
ZhikangNiu / Semantic-VAE
View on GitHub
[INTERSPEECH 2026 Oral]Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆121Jun 21, 2026Updated last month
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
modelscope / FunCodec
View on GitHub
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener…
☆445Jan 25, 2024Updated 2 years ago
xingchensong / S3Tokenizer
View on GitHub
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
☆521Dec 22, 2025Updated 7 months ago
lucidrains / magvit2-pytorch
View on GitHub
Implementation of MagViT2 Tokenizer in Pytorch
☆668Jan 12, 2025Updated last year
Stability-AI / stable-codec
View on GitHub
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
☆437Jul 17, 2026Updated last week
LTH14 / mar
View on GitHub
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
☆1,943Feb 20, 2026Updated 5 months ago
voidful / Codec-SUPERB
View on GitHub
Audio Codec Speech processing Universal PERformance Benchmark
☆308Jul 4, 2026Updated 3 weeks ago
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126Mar 20, 2025Updated last year
turingmotors / One-D-Piece
View on GitHub
[ICML 2025 Tokshop] One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression
☆83Jul 30, 2025Updated 11 months ago
Tikai7 / DiTTO-TTS
View on GitHub
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
☆39Feb 11, 2025Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
Jiang-Yidi / UniCodec
View on GitHub
[ACL 2025 Main] UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and s…
☆157May 30, 2025Updated last year
yl4579 / PL-BERT
View on GitHub
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆270Jan 13, 2025Updated last year
orangeduck / interact-retarget
View on GitHub
Retargeting of the InterAct dataset onto a common skeleton
☆26Sep 16, 2025Updated 10 months ago
ai-forever / MoVQGAN
View on GitHub
MoVQGAN - model for the image encoding and reconstruction
☆266Oct 31, 2023Updated 2 years ago
ZhikangNiu / A-DMA
View on GitHub
[INTERSPEECH 2025 Oral]Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67Jun 16, 2025Updated last year
LqNoob / Neural-Codec-and-Speech-Language-Models
View on GitHub
Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models
☆246Jul 9, 2026Updated 2 weeks ago
Oliver-Cong02 / UMO
View on GitHub
☆28Mar 18, 2026Updated 4 months ago