Neur-IO/ReVQ

Explore how to get VQ-VAE models efficiently!

☆69

Alternatives and similar repositories for ReVQ

Users that are interested in ReVQ are comparing it to the libraries listed below.

zhuangshaobin / WeTok
[ICLR2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
☆69 · Sep 3, 2025 · Updated 10 months ago
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
Mddct / usm-tokenizer
Semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
YuqingWang1029 / TokenBridge
[ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To…
☆158 · Jul 24, 2025 · Updated 11 months ago
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
yinboc / dito
Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"
☆169 · Jan 31, 2025 · Updated last year
dc-ai-projects / DC-AR
☆83 · Oct 18, 2025 · Updated 9 months ago
Mddct / cosyvoice2-flow-optimized
Faster inference
☆27 · Jan 20, 2025 · Updated last year
colaudiolab / AudioSet-R
Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"
☆19 · Oct 9, 2025 · Updated 9 months ago
YangXusheng-yxs / CodecFormer_5Hz
☆35 · Oct 23, 2025 · Updated 8 months ago
scottishfold0621 / ACMID
☆26 · Apr 30, 2026 · Updated 2 months ago
FoundationVision / UniTok
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
☆529 · Nov 14, 2025 · Updated 8 months ago
ZhengrongYue / UniFlow
Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation"
☆143 · Oct 17, 2025 · Updated 9 months ago
Hhhhhhao / continuous_tokenizer
☆321 · May 29, 2025 · Updated last year
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)
☆39 · Mar 31, 2026 · Updated 3 months ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year
YuqingWang1029 / PAR
[CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
☆186 · Mar 20, 2025 · Updated last year
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
pengzhendong / ngram-punctuator
An N-gram punctuator for Chinese and English.
☆18 · Oct 14, 2025 · Updated 9 months ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 · Jul 20, 2025 · Updated last year
apple / ml-flextok
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
☆322 · Jun 2, 2025 · Updated last year
fluxions-ai / stftvae
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43 · Jul 12, 2026 · Updated last week
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 · Aug 5, 2024 · Updated last year
SilentView / GigaTok
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
☆204 · Jan 7, 2026 · Updated 6 months ago
CompVis / discrete-interpolants
The official implementation of "[MASK] is All You Need"
☆127 · Jul 23, 2025 · Updated 11 months ago
westlake-repl / LeanVAE
[ICCV2025] LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models
☆111 · Updated this week
asigalov61 / Euterpe-X
[DEPRECATED] [PyTorch 2.0] [638M] [85.33% acc] Full-attention multi-instrumental music transformer for supervised music generation, opti…
☆33 · Nov 23, 2023 · Updated 2 years ago
OliverRensu / xAR
This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat…
☆251 · Oct 12, 2025 · Updated 9 months ago
theAdamColton / vq-clip
Train vector quantized CLIP models using PyTorch Lightning
☆21 · Jul 14, 2024 · Updated 2 years ago
ByteVisionLab / TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
☆464 · Aug 8, 2025 · Updated 11 months ago
DAMO-NLP-SG / DiGIT
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
☆78 · Oct 31, 2024 · Updated last year
NVlabs / QLIP
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
☆97 · Mar 1, 2025 · Updated last year
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
feizc / Vespa
Video Diffusion State Space Models
☆19 · Mar 27, 2024 · Updated 2 years ago
yuhuUSTC / FAR
Frequency Autoregressive Image Generation with Continuous Tokens
☆101 · Jun 9, 2025 · Updated last year
zelaki / eqvae
[ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
☆181 · Mar 18, 2026 · Updated 4 months ago
ali-vilab / alitok
[ICLR2026] AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model
☆56 · Oct 12, 2025 · Updated 9 months ago
lxa9867 / ImageFolder
High-performance Image Tokenizers for VAR and AR
☆307 · Apr 25, 2025 · Updated last year
merlresearch / sebbs
Prediction of sound event bounding boxes (SEBBs)
☆35 · Aug 2, 2024 · Updated last year