chenllliang/DnD-Transformer

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/chenllliang/DnD-Transformer)

chenllliang / DnD-Transformer

[ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation"

☆80

Alternatives and similar repositories for DnD-Transformer

Users that are interested in DnD-Transformer are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

chenllliang / ATP-AMR
View on GitHub
Source code for paper "ATP: AMRize Than Parse! Enhancing AMR Parsing with PseudoAMRs" @NAACL-2022
☆15Mar 31, 2023Updated 3 years ago
chenllliang / MLS
View on GitHub
Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ACL-2022
☆18May 19, 2022Updated 4 years ago
chenllliang / MMEvalPro
View on GitHub
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25Sep 26, 2024Updated last year
Bostoncake / C-VisDiT
View on GitHub
The official implementation of our ICCV 2023 publication, C-VisDiT
☆10Oct 23, 2024Updated last year
pkunlp-icler / PCA-EVAL
View on GitHub
[ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain
☆107Mar 14, 2024Updated 2 years ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
pkunlp-icler / SCL-RAI
View on GitHub
Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022
☆11Aug 20, 2022Updated 3 years ago
Yifan-Song793 / GoodBadGreedy
View on GitHub
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism
☆31Jul 17, 2024Updated 2 years ago
poppuppy / SAR
View on GitHub
☆34Dec 29, 2025Updated 6 months ago
THU-MIG / PYRA
View on GitHub
The official implementation of our ECCV 2024 publication, PYRA (Parallel Yielding Re-Activation).
☆22Dec 19, 2025Updated 7 months ago
lxa9867 / ImageFolder
View on GitHub
High-performance Image Tokenizers for VAR and AR
☆307Apr 25, 2025Updated last year
ByungKwanLee / Phantom
View on GitHub
[Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …
☆63Oct 9, 2024Updated last year
MonoFormer / MonoFormer
View on GitHub
The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"
☆92Oct 12, 2024Updated last year
SHI-Labs / VisPer-LM
View on GitHub
[NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
☆73Oct 17, 2025Updated 9 months ago
DAMO-NLP-SG / DiGIT
View on GitHub
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
☆78Oct 31, 2024Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
TencentARC / SEED-Voken
View on GitHub
SEED-Voken: A Series of Powerful Visual Tokenizers
☆1,018Nov 25, 2025Updated 8 months ago
mhh0318 / OneMoreStep
View on GitHub
☆25Nov 30, 2023Updated 2 years ago
mit-han-lab / hart
View on GitHub
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
☆647Oct 16, 2024Updated last year
NVlabs / QLIP
View on GitHub
[arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation
☆97Mar 1, 2025Updated last year
formll / resolving-scaling-law-discrepancies
View on GitHub
☆19Nov 4, 2025Updated 8 months ago
Pepper-lll / LMforImageGeneration
View on GitHub
Codebase for the paper-Elucidating the design space of language models for image generation
☆45Nov 17, 2024Updated last year
LMM101 / Awesome-Multimodal-Next-Token-Prediction
View on GitHub
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
☆476Jan 17, 2025Updated last year
Seeing-Fast-and-Slow / Seeing-Fast-and-Slow
View on GitHub
☆16May 28, 2026Updated last month
Jialuo-Li / Science-T2I
View on GitHub
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
☆62Mar 31, 2026Updated 3 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
rongyaofang / PUMA
View on GitHub
Empowering Unified MLLM with Multi-granular Visual Generation
☆132Jan 16, 2025Updated last year
FoundationVision / UniTok
View on GitHub
[NeurIPS 2025 Spotlight] A Unified Tokenizer for Visual Generation and Understanding
☆529Nov 14, 2025Updated 8 months ago
SilentView / GigaTok
View on GitHub
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
☆204Jan 7, 2026Updated 6 months ago
mit-han-lab / vila-u
View on GitHub
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
☆425Apr 25, 2025Updated last year
wdrink / SimpleAR
View on GitHub
Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆431Jun 20, 2025Updated last year
Beckschen / LLaVolta
View on GitHub
[NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression
☆66Feb 19, 2025Updated last year
Neur-IO / OptVQ
View on GitHub
Towards training VQ-VAE models robustly!
☆95Jul 14, 2025Updated last year
YuqingWang1029 / PAR
View on GitHub
[CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
☆186Mar 20, 2025Updated last year
KbsdJames / Omni-MATH
View on GitHub
The official repository of the Omni-MATH benchmark.
☆94Dec 22, 2024Updated last year
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
DCDmllm / MorphTokens
View on GitHub
☆43May 6, 2024Updated 2 years ago
neu-vi / FleVRS
View on GitHub
FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024
☆22Dec 9, 2024Updated last year
causalfusion / causalfusion
View on GitHub
☆196Dec 17, 2024Updated last year
hywang66 / LARP
View on GitHub
Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).
☆107Feb 11, 2025Updated last year
lancopku / clip-openness
View on GitHub
[ACL 2023] Delving into the Openness of CLIP
☆24Jan 11, 2023Updated 3 years ago
Alpha-VLLM / Lumina-mGPT
View on GitHub
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini…
☆646Oct 16, 2025Updated 9 months ago
chenllliang / G1
View on GitHub
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆103May 20, 2025Updated last year