lucidrains / titok-pytorch

Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"

☆181

Alternatives and similar repositories for titok-pytorch

Users who are interested in titok-pytorch are comparing it to the libraries listed below.

lucidrains / multimodal-dit-pytorch
Implementation of a multimodal diffusion transformer in Pytorch
☆106 Updated last year
lucidrains / maskbit-pytorch
Implementation of the proposed MaskBit from Bytedance AI
☆82 Updated last year
apple / ml-flextok
FlexTok: Resampling Images into 1D Token Sequences of Flexible Length
☆270 Updated 5 months ago
yinboc / dito
Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"
☆158 Updated 9 months ago
lucidrains / LVMAE-pytorch
Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch
☆55 Updated 11 months ago
OliverRensu / MVAR
☆71 Updated last year
alexanderswerdlow / unidisc
UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, a…
☆131 Updated 7 months ago
Gengzigang / TokenSet
Official PyTorch implementation of TokenSet.
☆127 Updated 7 months ago
zh460045050 / VQGAN-LC
☆139 Updated last year
MCG-NJU / DDT
DDT: Decoupled Diffusion Transformer
☆317 Updated 2 months ago
CompVis / tread
☆158 Updated last month
hp-l33 / AiM
Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba"
☆141 Updated 10 months ago
LINs-lab / UCGM
[Preprint] UCGM: Unified Continuous Generative Models
☆169 Updated 5 months ago
philippe-eecs / small-vision
A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
☆34 Updated last year
lucidrains / h-net-dynamic-chunking
Implementation of the dynamic chunking mechanism in H-net by Hwang et al. of Carnegie Mellon
☆65 Updated 3 months ago
tang-bd / fuse-dit
[CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
☆126 Updated 6 months ago
causalfusion / causalfusion
☆183 Updated 11 months ago
ShivamDuggal4 / adaptive-length-tokenizer
Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth ?
☆136 Updated 9 months ago
facebookresearch / EvalGIM
🦾 EvalGIM (pronounced as "EvalGym") is an evaluation library for generative image models. It enables easy-to-use, reproducible automatic…
☆88 Updated 10 months ago
visual-gen / semanticist
(ICCV 2025) "Principal Components" Enable A New Language of Images
☆72 Updated 3 months ago
sangyun884 / rfpp
The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024
☆124 Updated last year
chenllliang / DnD-Transformer
[ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr…
☆77 Updated 11 months ago
lucidrains / genie2-pytorch
Implementation of a framework for Genie2 in Pytorch
☆153 Updated 10 months ago
NVlabs / DDO
[ICML 2025 Spotlight] Direct Discriminative Optimization: Supercharging Diffusion/Autoregressive with GAN-type Discrimination
☆104 Updated 3 months ago
DAMO-NLP-SG / DiGIT
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
☆72 Updated last year
feizc / DiS
Scalable Diffusion Models with State Space Backbone
☆156 Updated last year
Neur-IO / OptVQ
Towards training VQ-VAE models robustly!
☆86 Updated 4 months ago
lucidrains / mmdit
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch
☆471 Updated 10 months ago
qihao067 / CrossFlow
[CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Mo…
☆318 Updated 5 months ago
NVlabs / TokenBench
A Video Tokenizer Evaluation Dataset
☆138 Updated 10 months ago