huawei-lin/VTBench

This repository provides the official implementation of VTBench, a benchmark designed to evaluate the performance of visual tokenizers (VTs) in the context of autoregressive (AR) image generation.

☆35

Alternatives and similar repositories for VTBench

Users that are interested in VTBench are comparing it to the libraries listed below.

huawei-lin / Agent-Omni
The official implementation for the paper "Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything".
☆23 · Nov 5, 2025 · Updated 8 months ago
YuqingWang1029 / TokenBridge
[ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To…
☆158 · Jul 24, 2025 · Updated last year
ali-vilab / alitok
[ICLR2026] AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model
☆56 · Oct 12, 2025 · Updated 9 months ago
asomoza / mellon-modular-diffusers
☆11 · May 14, 2025 · Updated last year
MajorDavidZhang / Generalization_unified_VLM
☆24 · May 23, 2025 · Updated last year
wjf5203 / TokBench
Image and video Tokenizer/VAE selection guide, text and face reconstruction evaluation.
☆152 · Jun 11, 2026 · Updated last month
showlab / D-AR
The official repo for "D-AR: Diffusion via Autoregressive Models"
☆138 · Jan 29, 2026 · Updated 5 months ago
ShivamDuggal4 / UNITE-tokenization-generation
Single-stage End-to-End Training for Tokenization and Generation
☆117 · Mar 24, 2026 · Updated 4 months ago
csuhan / Tar
[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
☆202 · Sep 18, 2025 · Updated 10 months ago
3DAgentWorld / VisAnything
☆30 · Sep 20, 2024 · Updated last year
X-Omni-Team / X-Omni
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
☆426 · Aug 26, 2025 · Updated 10 months ago
RenShuhuai-Andy / NBP
Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
☆42 · Feb 12, 2025 · Updated last year
Singularity0104 / NExT-Vid
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
☆22 · Dec 24, 2025 · Updated 7 months ago
selftok-team / SelftokTokenizer
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
☆238 · May 30, 2025 · Updated last year
wdrink / ARM
ARM: An AutoRegressive Large Multimodal Model with Discrete Representations
☆50 · Jun 10, 2026 · Updated last month
zelaki / eqvae
[ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
☆181 · Mar 18, 2026 · Updated 4 months ago
Tencent / HaploVLM
ICML2025
☆63 · Aug 28, 2025 · Updated 10 months ago
SilentView / GigaTok
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
☆204 · Jan 7, 2026 · Updated 6 months ago
apple / ml-atoken
☆145 · Nov 8, 2025 · Updated 8 months ago
showlab / UniRL
The code repository of UniRL
☆53 · May 30, 2025 · Updated last year
YuqingWang1029 / CubiD
[CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs…
☆63 · Apr 10, 2026 · Updated 3 months ago
Franklin-Zhang0 / ReasonGen-R1
Official repository for ReasonGen-R1
☆75 · Jun 23, 2025 · Updated last year
apple / ml-flextok
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
☆322 · Jun 2, 2025 · Updated last year
thuzhaowang / MonoPlane
Code release of our IROS 2024 paper "MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction"
☆19 · Nov 5, 2024 · Updated last year
EthanG97 / ImageDoctor
The official implementation for "ImageDoctor: Diagnosing Text-to-Image Generation via Grounded Image Reasoning"
☆15 · Mar 1, 2026 · Updated 4 months ago
Open-Model-Initiative / imagegen-speedrun
We bring the spirit of nanogpt-speedrun into the omni-modal world
☆15 · Jan 31, 2026 · Updated 5 months ago
GigaAI-research / WonderFree
☆19 · Jun 26, 2025 · Updated last year
ZhengrongYue / UniFlow
Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation"
☆143 · Oct 17, 2025 · Updated 9 months ago
OpenVE-Team / OpenVE-3M
OpenVE-3M: A Large-Scale High-Quality Dataset for Instruction-Guided Video Editing
☆51 · Apr 15, 2026 · Updated 3 months ago
tang-bd / v-grpo
[CVPR 2026 Findings] V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
☆56 · Apr 28, 2026 · Updated 2 months ago
zhuangshaobin / WeTok
[ICLR2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction
☆69 · Sep 3, 2025 · Updated 10 months ago
inclusionAI / Ming-UniVision
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
☆143 · Oct 14, 2025 · Updated 9 months ago
HorizonWind2004 / reconstruction-alignment
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆410 · May 23, 2026 · Updated 2 months ago
huangrh99 / AlphaGRPO
[ICML2026] Official Implementation of AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in Unified Multimodal Models via Decompo…
☆73 · Jul 14, 2026 · Updated last week
FoundationVision / UniTok
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
☆529 · Nov 14, 2025 · Updated 8 months ago
limbo0000 / mtm
Official implementation of MTM
☆21 · Aug 30, 2023 · Updated 2 years ago
wangf3014 / VTok
Official implementation of VTok: A Unified Video Tokenizer with Decoupled Spatial-Temporal Latents
☆15 · Feb 5, 2026 · Updated 5 months ago
flying-sky999 / OmniV2V
☆15 · Jun 2, 2025 · Updated last year
kylesargent / FlowMo
Official PyTorch implementation of FlowMo.
☆117 · Apr 7, 2025 · Updated last year