ali-vilab/CDT

ali-vilab / CDT

Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach

☆17

Alternatives and similar repositories for CDT

Users who are interested in CDT are comparing it to the repositories listed below.

edward3862 / Analogist
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model (SIGGRAPH 2024)
☆38 · Sep 10, 2024 · Updated last year
black-yt / ReaLS
Exploring Representation-Aligned Latent Space for Better Generation
☆19 · Mar 17, 2026 · Updated 4 months ago
LanDiff / LanDiff
The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation
☆41 · May 4, 2025 · Updated last year
zai-org / SSVAE
official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability".
☆71 · Dec 25, 2025 · Updated 6 months ago
jingyu198 / Hyper3D
☆17 · Jun 29, 2026 · Updated 3 weeks ago
qiuk2 / RobusTok
Image Tokenizer Needs Post-Training
☆24 · Oct 4, 2025 · Updated 9 months ago
Eniac-Xie / FuseTeacher
☆12 · Nov 26, 2024 · Updated last year
TingtingLiao / unique3d-diffusion
☆46 · Sep 27, 2024 · Updated last year
ZhenglinZhou / Zero-1-to-A
[CVPR 2025] Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
☆43 · Mar 21, 2025 · Updated last year
seanywang0408 / AudioEar
Official code of AAAI'23 paper AudioEar: Single-View Ear Reconstruction for Personalized Spatial Audio written in PyTorch
☆44 · Dec 14, 2023 · Updated 2 years ago
XinyaChen21 / TeFF
☆22 · Sep 26, 2024 · Updated last year
inclusionAI / TC-AE
Official repo for "TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders"
☆24 · Apr 9, 2026 · Updated 3 months ago
Zhangyr2022 / D3QE
[ICCV 2025] D^3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection
☆17 · Jul 11, 2026 · Updated last week
Rishit-dagli / Squeeze3D
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor
☆23 · Jun 12, 2025 · Updated last year
yangxiaofeng / LODS
Official code for ECCV 2024 paper: Learn to Optimize Denoising Scores A Unified and Improved Diffusion Prior for 3D Generation
☆72 · Jul 11, 2024 · Updated 2 years ago
CASIA-IVA-Lab / VRoPE
[EMNLP 2025 Main] Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.
☆28 · Nov 18, 2025 · Updated 8 months ago
Yanwen-W / TeRA
[ICCV 2025] TeRA: Rethinking Text-guided Realistic 3D Avatar Generation
☆19 · Sep 13, 2025 · Updated 10 months ago
sig22virtualbones / VirtualBonesDataset
☆20 · Apr 20, 2022 · Updated 4 years ago
microsoft / Reducio-VAE
☆217 · Feb 11, 2025 · Updated last year
markywg / transagent
[NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
☆25 · Oct 17, 2024 · Updated last year
WeichenFan / UAE
Official repo for UAE
☆206 · Jun 21, 2026 · Updated last month
MCG-NJU / PixNerd
[ICLR 2026] PixNerd: Pixel Neural Field Diffusion
☆182 · Dec 10, 2025 · Updated 7 months ago
XZYW7 / StyleTex
☆20 · Mar 31, 2025 · Updated last year
SilentView / GigaTok
[ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
☆204 · Jan 7, 2026 · Updated 6 months ago
zjysteven / controlnet_tile
Workable training script for ControlNet tile
☆35 · May 2, 2024 · Updated 2 years ago
ryunuri / POP3D
[SIGGRAPH Asia 2023] Official pytorch implementation of "360° Reconstruction From a Single Image Using Space Carved Outpainting"
☆17 · Sep 15, 2023 · Updated 2 years ago
zeng-yifei / AvatarBooth
Official implementation of “AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation”
☆57 · Jun 4, 2024 · Updated 2 years ago
huang-yh / Owl
☆52 · Dec 13, 2024 · Updated last year
adobe-research / ImageFolder
☆20 · Dec 8, 2024 · Updated last year
lavinal712 / control-lora-v3
☆11 · Dec 15, 2025 · Updated 7 months ago
HumanEval-V / HumanEval-V-Benchmark
A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks
☆15 · Feb 25, 2025 · Updated last year
ZTX-100 / DLA-Combined-IoUs
☆13 · Aug 9, 2022 · Updated 3 years ago
EmmaSRH / ARVFM
Awesome autoregressive vision foundation models
☆26 · Dec 24, 2024 · Updated last year
maxcrous / SIFT
A vectorized implementation of Lowe's Scale Invariant Feature Transform.
☆10 · Dec 2, 2021 · Updated 4 years ago
Chenliang-Zhou / CLIP-PAE
Projection-augmentation embedding for CLIP-based latent manipulation methods
☆25 · Feb 2, 2026 · Updated 5 months ago
YS-IMTech / HyperDreamer
(SIGGRAPH Asia 2023) Project Page of "HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image"
☆10 · Dec 9, 2023 · Updated 2 years ago
facebookresearch / dualformer
implementation of dualformer
☆25 · Mar 1, 2025 · Updated last year
irom-princeton / perception-guarantees
Code for combining generalization guarantees for perception and planning.
☆14 · Jun 23, 2025 · Updated last year
pit30m / pit30m
The Python SDK for the Pit30M large scale visual localization dataset.
☆19 · Jan 4, 2025 · Updated last year