VachanVY / Transfusion.torch

PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

☆17

Alternatives and similar repositories for Transfusion.torch:

Users who are interested in Transfusion.torch are comparing it to the libraries listed below.

shim0114 / SSM-Meets-Video-Diffusion-Models
☆43 Updated 4 months ago
cfifty / rotation_trick
☆67 Updated 3 months ago
jacklishufan / OmniFlows
The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
☆50 Updated 3 weeks ago
causalfusion / causalfusion
☆133 Updated last month
CompVis / mask
The official implementation of "[MASK] is All You Need"
☆104 Updated last month
FoundationVision / vaex
🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook
☆74 Updated 7 months ago
lucidrains / maskbit-pytorch
Implementation of the proposed MaskBit from Bytedance AI
☆71 Updated 2 months ago
DAMO-NLP-SG / DiGIT
[NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
☆59 Updated 3 months ago
MonoFormer / MonoFormer
The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"
☆81 Updated 3 months ago
lxa9867 / Awesome-Autoregressive-Visual-Generation
This is a repo to track the latest autoregressive visual generation papers.
☆119 Updated last week
duchenzhuang / FSQ-pytorch
A PyTorch Implementation of Finite Scalar Quantization
☆104 Updated last year
Epiphqny / PAR
The official implementation of PAR: Parallelized Autoregressive Visual Generation. https://epiphqny.github.io/PAR-project/
☆108 Updated 3 weeks ago
lucidrains / multimodal-dit-pytorch
Implementation of a multimodal diffusion transformer in PyTorch
☆99 Updated 7 months ago
sangyun884 / rfpp
The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024
☆92 Updated 3 months ago
THUDM / VisionReward
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆117 Updated last week
PKU-YuanGroup / WF-VAE
Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
☆113 Updated this week
feizc / Dimba
Transformer-Mamba Diffusion Models
☆95 Updated 7 months ago
zh460045050 / VQGAN-LC
☆112 Updated 7 months ago
ruocwang / dpo-diffusion
[ICML 2024] On Discrete Prompt Optimization for Diffusion Models - Google
☆45 Updated 5 months ago
wangyuchi369 / LaDiC
[NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?
☆37 Updated 7 months ago
czg1225 / CoDe
CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient
☆75 Updated last week
xie-lab-ml / awesome-alignment-of-diffusion-models
The collection of awesome papers on alignment of diffusion models.
☆84 Updated last week
lxa9867 / ImageFolder
XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation
☆182 Updated last week
lxa9867 / ControlVAR
This is the official implementation for ControlVAR.
☆91 Updated last month
feizc / DiS
Scalable Diffusion Models with State Space Backbone
☆150 Updated 10 months ago
G-U-N / Rectified-Diffusion
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
☆165 Updated last month
philippe-eecs / vitok
☆23 Updated 2 weeks ago
1202kbs / GCTM
Official PyTorch implementation of "Generalized Consistency Trajectory Models for Image Manipulation"
☆34 Updated 10 months ago
kyegomez / ViTAR
Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch
☆30 Updated 2 months ago
FoundationVision / Liquid
Liquid: Language Models are Scalable Multi-modal Generators
☆61 Updated last month