HKUST-LongGroup / Diff-II
[CVPR 2025] PyTorch implementation of Diff-II
☆14Updated 3 months ago
Alternatives and similar repositories for Diff-II
Users who are interested in Diff-II are comparing it to the libraries listed below
- A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…☆97Updated this week
- [ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation☆128Updated 5 months ago
- [CVPR 2025] RAP: Retrieval-Augmented Personalization☆58Updated last week
- [ICLR 2025] Diffusion Feedback Helps CLIP See Better☆280Updated 5 months ago
- Official code of SmartEdit [CVPR-2024 Highlight]☆340Updated last year
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆131Updated last month
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆177Updated 3 weeks ago
- 🔥CVPR 2025 Multimodal Large Language Models Paper List☆144Updated 3 months ago
- The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation☆18Updated 6 months ago
- ☆22Updated 5 months ago
- [ECCV 2024] Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models☆49Updated 11 months ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories☆54Updated 3 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation☆139Updated 5 months ago
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]☆50Updated 2 weeks ago
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu…☆86Updated 2 months ago
- Official PyTorch Code for "ATPrompt: Textual Prompt Learning with Embedded Attributes"☆38Updated 6 months ago
- ☆24Updated last year
- [CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos☆73Updated 2 months ago
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch.☆12Updated 8 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆122Updated 5 months ago
- [CVPR2025] FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression☆42Updated 3 months ago
- Official implementation for "Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter"☆40Updated last year
- CAR: Controllable AutoRegressive Modeling for Visual Generation☆120Updated 6 months ago
- Official code for "DiffX: Guide Your Layout to Cross-Modal Generative Modeling"☆22Updated 4 months ago
- Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"☆39Updated 4 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation☆32Updated 2 months ago
- New generation of CLIP with fine grained discrimination capability, ICML2025☆200Updated last month
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning☆49Updated last month
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati…☆70Updated last year
- ☆64Updated 2 months ago