CVPR 2025 Accepted Papers
☆24Dec 20, 2025Updated 4 months ago
Alternatives and similar repositories for Mask2DiT
Users who are interested in Mask2DiT are comparing it to the libraries listed below.
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆17May 8, 2025Updated 11 months ago
- ☆18Mar 21, 2025Updated last year
- Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset☆107Feb 25, 2026Updated 2 months ago
- ☆29Sep 4, 2025Updated 7 months ago
- ☆21Jun 3, 2023Updated 2 years ago
- [CVPR 2025] PoseTraj: Pose-Aware Trajectory Control in Video Diffusion☆21Oct 11, 2025Updated 6 months ago
- ☆19Apr 16, 2025Updated last year
- ☆100Nov 6, 2025Updated 5 months ago
- USTC Cross-Modal Intelligence Group: weekly paper sharing☆16Nov 20, 2022Updated 3 years ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark☆28Apr 22, 2025Updated last year
- Video Diffusion Transformers are In-Context Learners☆36Jan 6, 2025Updated last year
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation☆46Aug 26, 2025Updated 8 months ago
- PyTorch implementation for Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation.☆18Jan 4, 2022Updated 4 years ago
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models☆156Mar 4, 2026Updated last month
- Balanced Classification: A Unified Framework for Long-Tailed Object Detection (TMM 2023)☆102Apr 18, 2025Updated last year
- Consistent Autoregressive Video Generation with Long Context☆81Feb 6, 2026Updated 2 months ago
- Top-1 solution from the semifinal of the 3rd Huawei Cloud Unmanned Vehicle Challenge Cup, Traffic sign detection, yolov4, mindspore☆14Aug 26, 2021Updated 4 years ago
- ☆108Jan 6, 2026Updated 3 months ago
- ☆10Feb 16, 2022Updated 4 years ago
- ☆13Jul 10, 2024Updated last year
- Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection☆56Aug 16, 2025Updated 8 months ago
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆62Jun 6, 2025Updated 10 months ago
- [CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning☆52Mar 26, 2026Updated last month
- [WACV 2025] Cross-Task Affinity Learning for Multitask Dense Scene Predictions☆11Jun 12, 2025Updated 10 months ago
- ☆14Feb 16, 2022Updated 4 years ago
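- Open-source code for PANDA large-scene multi-object detection and tracking (preliminary-round detection), ranked 13th in the preliminary round☆13Jul 17, 2021Updated 4 years ago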
- ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting☆90Feb 11, 2023Updated 3 years ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆47Apr 10, 2025Updated last year
- Animate Any Character in Any World☆97Mar 10, 2026Updated last month
- Jittor ("计图") Algorithm Challenge: fine-grained dog classification, 4/430☆10Apr 26, 2021Updated 5 years ago
- [CVPR 2024] Official implementation of "DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations"☆279Jul 5, 2025Updated 9 months ago
- [CVPR 2025] Decision SpikeFormer: Spike-Driven Transformer for Decision Making☆19Aug 8, 2025Updated 8 months ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition☆10Aug 10, 2025Updated 8 months ago
- DreamStyle: A Unified Framework for Video Stylization☆119Jan 7, 2026Updated 3 months ago
- UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs (WWW'25)☆18Apr 22, 2025Updated last year
- [MM'22 Oral] AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation☆11Apr 3, 2023Updated 3 years ago
- ☆13Jul 24, 2017Updated 8 years ago
- [CVPR 2025] "DepthCues: Evaluating Monocular Depth Perception in Large Vision Models", Duolikun Danier, Mehmet Aygün, Changjian Li, Hakan…☆23Mar 17, 2025Updated last year
- [ECCV 2024] Official code repository of paper titled "Efficient 3D-Aware Facial Image Editing Via Attribute-Specific Prompt Learning"☆10Aug 2, 2024Updated last year