CVPR 2025 Accepted Papers
☆26Dec 20, 2025Updated 5 months ago
Alternatives and similar repositories for Mask2DiT
Users who are interested in Mask2DiT are comparing it to the repositories listed below.
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆17May 8, 2025Updated last year
- ☆18Mar 21, 2025Updated last year
- Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset☆110Feb 25, 2026Updated 3 months ago
- ☆29Sep 4, 2025Updated 9 months ago
- ☆21Jun 3, 2023Updated 3 years ago
- [CVPR 2025] PoseTraj: Pose-Aware Trajectory Control in Video Diffusion☆22May 26, 2026Updated last week
- ☆19Apr 16, 2025Updated last year
- ☆98Nov 6, 2025Updated 7 months ago
- USTC Cross-Modal Intelligence Group: weekly paper sharing☆15Nov 20, 2022Updated 3 years ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark☆28Apr 22, 2025Updated last year
- Video Diffusion Transformers are In-Context Learners☆37Jan 6, 2025Updated last year
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation☆48Aug 26, 2025Updated 9 months ago
- PyTorch implementation for Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation.☆18Jan 4, 2022Updated 4 years ago
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models☆159Mar 4, 2026Updated 3 months ago
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection (TMM 2023)☆101Apr 18, 2025Updated last year
- Consistent Autoregressive Video Generation with Long Context☆88Feb 6, 2026Updated 4 months ago
- Top-1 solution from the semifinal round of the 3rd Huawei Cloud Unmanned Vehicle Challenge Cup: traffic sign detection, YOLOv4, MindSpore☆14Aug 26, 2021Updated 4 years ago
- ☆109Jan 6, 2026Updated 5 months ago
- ☆10Feb 16, 2022Updated 4 years ago
- ☆13Jul 10, 2024Updated last year
- Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection☆56Aug 16, 2025Updated 9 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆62Jun 6, 2025Updated last year
- [WACV 2025] Cross-Task Affinity Learning for Multitask Dense Scene Predictions☆11Jun 12, 2025Updated 11 months ago
- [CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning☆56Mar 26, 2026Updated 2 months ago
- ☆14Feb 16, 2022Updated 4 years ago
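- Open-source code for PANDA large-scene multi-object detection and tracking (preliminary-round detection task), ranked 13th in the preliminary round☆13Jul 17, 2021Updated 4 years ago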
- ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting☆90Feb 11, 2023Updated 3 years ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆47Apr 10, 2025Updated last year
- [arXiv 2512.17796] Animate Any Character in Any World☆96Mar 10, 2026Updated 2 months ago
- Jittor ("计图") Algorithm Challenge: fine-grained dog classification, ranked 4/430☆10Apr 26, 2021Updated 5 years ago
- [CVPR 2024] Official implementation of "DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations"☆280Jul 5, 2025Updated 11 months ago
- [CVPR 2025] Decision SpikeFormer: Spike-Driven Transformer for Decision Making☆19Aug 8, 2025Updated 10 months ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition☆10Aug 10, 2025Updated 9 months ago
- DreamStyle: A Unified Framework for Video Stylization☆119Jan 7, 2026Updated 5 months ago
- UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs (WWW'25)☆18Apr 22, 2025Updated last year
- ☆27Aug 9, 2025Updated 10 months ago
- [MM'22 Oral] AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation☆11Apr 3, 2023Updated 3 years ago
- [CVPR 2025] "DepthCues: Evaluating Monocular Depth Perception in Large Vision Models", Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan…☆23Mar 17, 2025Updated last year
- ☆13Jul 24, 2017Updated 8 years ago