ali-vilab / CDT
Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach
☆12 Updated 3 months ago
Alternatives and similar repositories for CDT
Users who are interested in CDT compare it with the repositories listed below:
- the official repo for "D-AR: Diffusion via Autoregressive Models" ☆106 Updated 3 weeks ago
- [CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation ☆30 Updated last week
- Official implementation of LaVin-DiT ☆35 Updated 5 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆73 Updated 4 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆37 Updated 5 months ago
- DDS: Delta Denoising Score PyTorch implementation ☆19 Updated last year
- PyTorch Implementation of "LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding" ☆19 Updated 4 months ago
- [ICLR2025] IV-Mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis ☆33 Updated 5 months ago
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆63 Updated 5 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆45 Updated 8 months ago
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆54 Updated 3 months ago
- TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation ☆31 Updated 7 months ago
- ☆26 Updated 2 months ago
- TPDiff: Temporal Pyramid Video Diffusion Model ☆20 Updated 4 months ago
- Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models" ☆34 Updated 2 weeks ago
- ☆39 Updated last year
- Unifying Specialized Visual Encoders for Video Language Models ☆21 Updated 3 weeks ago
- ☆20 Updated last month
- [Arxiv 2025] ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions ☆32 Updated last month
- [CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin… ☆30 Updated 2 months ago
- [ICLR 2025] Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding ☆39 Updated 2 months ago
- Sora Generates Videos with Stunning Geometrical Consistency ☆51 Updated last year
- Stable Consistency Tuning: Understanding and Improving Consistency models ☆16 Updated 8 months ago
- ☆10 Updated last year
- Official PyTorch implementation of the paper "Equivariant Image Modeling" (https://arxiv.org/abs/2503.18948) ☆33 Updated 3 months ago
- Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NIPS 2024] ☆81 Updated 8 months ago
- Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models. ☆21 Updated last month
- Video Diffusion State Space Models ☆19 Updated last year
- FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax ☆18 Updated last year
- Code for paper Background Prompting for Improved Object Depth ☆29 Updated last year