ChaofanTao / Autoregressive-Models-in-Vision-Survey

[TMLR 2025🔥] A survey of autoregressive models in vision.

☆725

Alternatives and similar repositories for Autoregressive-Models-in-Vision-Survey

Users who are interested in Autoregressive-Models-in-Vision-Survey are comparing it to the repositories listed below.

lxa9867 / Awesome-Autoregressive-Visual-Generation
This is a repo to track the latest autoregressive visual generation papers.
☆405 Updated 4 months ago
showlab / Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
☆725 Updated 2 weeks ago
AIDC-AI / Awesome-Unified-Multimodal-Models
Awesome Unified Multimodal Models
☆823 Updated 2 months ago
ByteVisionLab / TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
☆395 Updated 2 months ago
AlonzoLeeeooo / awesome-video-generation
A collection of awesome video generation studies.
☆659 Updated last week
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think
☆574 Updated last week
baaivision / NOVA
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
☆581 Updated last month
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
☆512 Updated 6 months ago
ziqihuangg / Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
☆368 Updated last month
Purshow / Awesome-Unified-Multimodal
📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.
☆319 Updated last week
mayuelala / Awesome-Controllable-Video-Generation
[ArXiv 2025] A survey about controllable video generation: This repo is the official awesome of "Controllable video generation: A survey…
☆517 Updated last week
bytedance / 1d-tokenizer
This repo contains the code for 1D tokenizer and generator
☆1,061 Updated 7 months ago
wdrink / SimpleAR
PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆411 Updated 4 months ago
JunyaoHu / common_metrics_on_video_quality
You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.
☆480 Updated 9 months ago
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆1,463 Updated last week
TencentARC / SEED-Voken
SEED-Voken: A Series of Powerful Visual Tokenizers
☆957 Updated this week
hustvl / LightningDiT
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
☆1,222 Updated 4 months ago
xie-lab-ml / awesome-alignment-of-diffusion-models
The collection of awesome papers on alignment of diffusion models.
☆348 Updated this week
mit-han-lab / vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
☆398 Updated 6 months ago
lxa9867 / ImageFolder
High-performance Image Tokenizers for VAR and AR
☆292 Updated 6 months ago
xuyang-liu16 / Awesome-Generation-Acceleration
📚 Collection of awesome generation acceleration resources.
☆356 Updated 3 months ago
yzhang2016 / video-generation-survey
A reading list of video generation
☆628 Updated this week
bytetriper / RAE
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
☆1,311 Updated last week
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆291 Updated last month
FoundationVision / UniTok
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
☆425 Updated last month
showlab / Show-o
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
☆1,751 Updated this week
sihyun-yu / REPA
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
☆1,382 Updated 7 months ago
SiatMMLab / Awesome-Diffusion-Model-Based-Image-Editing-Methods
Diffusion Model-Based Image Editing: A Survey (TPAMI 2025)
☆675 Updated 3 months ago
selftok-team / SelftokTokenizer
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
☆224 Updated 4 months ago
daixiangzi / VAR-CLIP
Implements VAR+CLIP for text-to-image (T2I) generation
☆146 Updated 9 months ago