trungpx / xmdpt
ICML 2024, Official Implementation of "Cross-view Masked Diffusion Transformers for Person Image Synthesis."
☆29 Updated 8 months ago
Alternatives and similar repositories for xmdpt
Users who are interested in xmdpt are comparing it to the repositories listed below.
- [ECCV 2024] FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing ☆35 Updated last week
- ☆25 Updated 4 months ago
- ☆25 Updated 4 months ago
- [ECCV'24] Official code for "BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation" ☆16 Updated 8 months ago
- ☆17 Updated 8 months ago
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆42 Updated 5 months ago
- This repository is the official implementation of the paper: Physics Informed Distillation for Diffusion Models, accepted by Transactions… ☆27 Updated 6 months ago
- Causal Localization Network for Radar Human Localization with micro-Doppler signature ☆23 Updated 9 months ago
- Winning SubNetwork (WSN), Fourier Subneural Operator (FSO), Video-Incremental Learning (VIL), Sequential Neural Implicit Representation (… ☆25 Updated 8 months ago
- [ICLR'23] ESD: Expected Squared Difference as a Tuning-Free Trainable Calibration Measure ☆16 Updated last year
- [ECCV 2024] Official repository for "DataDream: Few-shot Guided Dataset Generation" ☆41 Updated 11 months ago
- FRAG: Frequency Adaptive Group for Diffusion Video Editing (ICML 2024) ☆34 Updated 10 months ago
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025) ☆38 Updated 2 months ago
- ☆27 Updated 7 months ago
- ☆51 Updated 4 months ago
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆49 Updated 2 months ago
- [ECCV’24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models" ☆20 Updated 3 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆33 Updated 3 months ago
- (arXiv.2405.18406) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives ☆36 Updated 8 months ago
- [ICLR 2024] Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models. ☆83 Updated 11 months ago
- Diffusion-TTA improves pre-trained discriminative models such as image classifiers or segmentors using pre-trained generative models. ☆74 Updated last year
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024) ☆28 Updated 11 months ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature. ☆25 Updated 5 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆41 Updated 7 months ago
- Official implementation of TCL (CVPR 2023) ☆114 Updated 2 years ago
- [NeurIPS2024] ☆25 Updated 7 months ago
- Code for the paper Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models @ CVPR 2024 ☆66 Updated last year
- ☆59 Updated last year
- Question-Aware Gaussian Experts for Audio-Visual Question Answering -- Official Pytorch Implementation (CVPR'25, Highlight) ☆17 Updated last month
- Code implementation of our ICCV 2025 paper: On Large Multimodal Models as Open-World Image Classifiers ☆22 Updated last week