CVPR 2025 Accepted Papers
☆25Dec 20, 2025Updated 4 months ago
Alternatives and similar repositories for Mask2DiT
Users who are interested in Mask2DiT are comparing it to the repositories listed below.
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆17May 8, 2025Updated last year
- ☆18Mar 21, 2025Updated last year
- Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset☆109Feb 25, 2026Updated 2 months ago
- ☆29Sep 4, 2025Updated 8 months ago
- ☆21Jun 3, 2023Updated 2 years ago
- [CVPR 2025] PoseTraj: Pose-Aware Trajectory Control in Video Diffusion☆22Oct 11, 2025Updated 7 months ago
- ☆19Apr 16, 2025Updated last year
- ☆99Nov 6, 2025Updated 6 months ago
- USTC Cross-Modal Intelligence Group: weekly paper sharing☆15Nov 20, 2022Updated 3 years ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark☆28Apr 22, 2025Updated last year
- Video Diffusion Transformers are In-Context Learners☆36Jan 6, 2025Updated last year
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation☆47Aug 26, 2025Updated 8 months ago
- PyTorch implementation of Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation.☆18Jan 4, 2022Updated 4 years ago
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models☆160Mar 4, 2026Updated 2 months ago
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection (TMM 2023)☆102Apr 18, 2025Updated last year
- Consistent Autoregressive Video Generation with Long Context☆82Feb 6, 2026Updated 3 months ago
- Top-1 solution from the second round of the 3rd Huawei Cloud Unmanned Vehicle Challenge Cup, Traffic sign detection, yolov4, mindspore☆14Aug 26, 2021Updated 4 years ago
- ☆108Jan 6, 2026Updated 4 months ago
- ☆10Feb 16, 2022Updated 4 years ago
- ☆13Jul 10, 2024Updated last year
- Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection☆56Aug 16, 2025Updated 9 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆62Jun 6, 2025Updated 11 months ago
- [CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning☆54Mar 26, 2026Updated last month
- [WACV 2025] Cross-Task Affinity Learning for Multitask Dense Scene Predictions☆11Jun 12, 2025Updated 11 months ago
- ☆14Feb 16, 2022Updated 4 years ago
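- Open-source code for PANDA large-scene multi-object detection and tracking (preliminary round, detection), ranked 13th in the preliminary round☆13Jul 17, 2021Updated 4 years ago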
- ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting☆90Feb 11, 2023Updated 3 years ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆47Apr 10, 2025Updated last year
- Animate Any Character in Any World☆97Mar 10, 2026Updated 2 months ago
- Jittor Algorithm Challenge: fine-grained dog classification, 4/430☆10Apr 26, 2021Updated 5 years ago
- [CVPR 2024] Official implementation of "DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations"☆280Jul 5, 2025Updated 10 months ago
- [CVPR 2025] Decision SpikeFormer: Spike-Driven Transformer for Decision Making☆19Aug 8, 2025Updated 9 months ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition☆10Aug 10, 2025Updated 9 months ago
- DreamStyle: A Unified Framework for Video Stylization☆119Jan 7, 2026Updated 4 months ago
- UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs (WWW'25)☆18Apr 22, 2025Updated last year
- ☆25Aug 9, 2025Updated 9 months ago
- [MM'22 Oral] AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation☆11Apr 3, 2023Updated 3 years ago
- [CVPR 2025] "DepthCues: Evaluating Monocular Depth Perception in Large Vision Models", Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan…☆23Mar 17, 2025Updated last year
- ☆13Jul 24, 2017Updated 8 years ago