Video Diffusion Transformers are In-Context Learners
☆35Jan 6, 2025Updated last year
Alternatives and similar repositories for Video-In-Context
Users who are interested in Video-In-Context are comparing it to the libraries listed below.
- ☆18Mar 21, 2025Updated 11 months ago
- Blending Custom Photos with Video Diffusion Transformers☆48Jan 21, 2025Updated last year
- [CVPR2025] Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation☆18May 2, 2025Updated 9 months ago
- TPDiff: Temporal Pyramid Video Diffusion Model☆23Mar 13, 2025Updated 11 months ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening☆69May 18, 2025Updated 9 months ago
- (CVPR 2025) Scaling Down Text Encoders of Text-to-Image Diffusion Models☆52Sep 10, 2025Updated 5 months ago
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation☆23Jul 30, 2025Updated 7 months ago
- [CVPR 2024] BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation☆45May 7, 2024Updated last year
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024)☆10Jun 15, 2024Updated last year
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception☆14Jul 4, 2025Updated 7 months ago
- CVPR 2025 Accepted Papers☆23Dec 20, 2025Updated 2 months ago
- ☆13Jul 10, 2024Updated last year
- Responsible Visual Editing☆15Jul 10, 2024Updated last year
- EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing☆30Mar 29, 2024Updated last year
- ☆81Oct 13, 2025Updated 4 months ago
- ☆16May 13, 2025Updated 9 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆35Mar 12, 2024Updated last year
- ☆15Mar 30, 2025Updated 11 months ago
- ☆13Mar 8, 2024Updated last year
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).☆98Feb 11, 2025Updated last year
- [CVPR 2025] Official code of "DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Long…☆321Mar 30, 2025Updated 11 months ago
- Code for "SePPO: Semi-Policy Preference Optimization for Diffusion Alignment."☆18Oct 7, 2024Updated last year
- [ICCV 2025] The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation☆21Oct 12, 2025Updated 4 months ago
- Official PyTorch implementation - Video Motion Transfer with Diffusion Transformers☆78Jul 29, 2025Updated 7 months ago
- FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax☆18Nov 23, 2023Updated 2 years ago
- ☆16Feb 23, 2025Updated last year
- ☆43May 30, 2025Updated 9 months ago
- Benchmark dataset and code of MSRVTT-Personalization☆52Nov 10, 2025Updated 3 months ago
- [ECCV 2024] Official code for: SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer☆112Jun 30, 2025Updated 8 months ago
- [NeurIPS'25 Spotlight] MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation☆20Feb 23, 2025Updated last year
- VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning☆59Nov 4, 2025Updated 3 months ago
- The official implementation of 'GRID: Visual Layout Generation.'☆21Dec 28, 2024Updated last year
- Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection☆55Aug 16, 2025Updated 6 months ago
- ☆20Jan 1, 2026Updated last month
- ☆20Feb 9, 2026Updated 2 weeks ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆48Apr 10, 2025Updated 10 months ago
- Official code for VINCIE: Unlocking In-context Image Editing from Video☆48Sep 8, 2025Updated 5 months ago
- ☆17Feb 20, 2025Updated last year
- ☆22Mar 7, 2025Updated 11 months ago