FoundationVision / Autoregressive-Models-in-Vision-Survey

A collection of papers on autoregressive models in vision.

☆10

Alternatives and similar repositories for Autoregressive-Models-in-Vision-Survey

Users who are interested in Autoregressive-Models-in-Vision-Survey are comparing it to the repositories listed below.

ziplab / SN-Netv2
[ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".
☆27 Updated last year
philippe-eecs / small-vision
A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
☆34 Updated last year
ali-vilab / alitok
AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model
☆36 Updated 2 weeks ago
lilijiangg / AutoDiffusion
☆45 Updated last year
LINs-lab / GMem
[Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models
☆38 Updated 4 months ago
DefengXie / Edit_Everything
☆19 Updated 2 years ago
sail-sg / ScaleLong
The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N…
☆50 Updated last year
Gen-Verse / HermesFlow
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
☆63 Updated 5 months ago
maple-research-lab / SIM
Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NIPS 2024]
☆81 Updated 8 months ago
lucasjinreal / LLaVA-Magvit2
LLaVA combined with the Magvit image tokenizer, training an MLLM without a Vision Encoder. Unifying image understanding and generation.
☆37 Updated last year
ariG23498 / TokenLearner
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"
☆35 Updated 3 years ago
OliverRensu / DeepMIM
[WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling
☆53 Updated 2 months ago
MengLcool / DeepStack-VL
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…
☆37 Updated last year
viiika / HumanEdit
[CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin…
☆30 Updated 2 months ago
OpenGVLab / DiffAgent
[CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
☆17 Updated last year
imagination-research / distilled-decoding
[ICLR 2025] Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
☆47 Updated 2 months ago
vvvvvjdy / SRA
(SRA) No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves
☆71 Updated this week
mini-sora / MiniSora-DiT
minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora
☆40 Updated last year
deepglint / RealSyn
[ACM MM2025] The official repository for the RealSyn dataset
☆35 Updated last week
360CVGroup / Inner-Adaptor-Architecture
LMM solved catastrophic forgetting, AAAI2025
☆44 Updated 3 months ago
MCG-NJU / FlowDCN
[NeurIPS 2024] Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution
☆32 Updated 6 months ago
ThisisBillhe / NAR
The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation"
☆54 Updated 3 months ago
pkulwj1994 / diff_instruct
Official code for the Diff-Instruct algorithm for one-step diffusion distillation
☆78 Updated 6 months ago
weijiawu / Awesome-Synthetic-Data-for-Perception-Task
☆43 Updated 2 years ago
feizc / Vespa
Video Diffusion State Space Models
☆19 Updated last year
yliu-cs / PiTe
[ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model
☆16 Updated 5 months ago
NVlabs / DDO
[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination
☆85 Updated 3 weeks ago
weichow23 / AnySD
Official model implementation and benchmark evaluation repository of <AnyEdit: Unified High-Quality Image Edit with Any Idea>
☆24 Updated 3 months ago
mightyzau / InfMLLM
☆19 Updated last year
TencentARC / ViSFT
☆34 Updated last year