PKU-ICST-MIPL / MAI_ICLR2025
☆14Updated 4 months ago
Alternatives and similar repositories for MAI_ICLR2025
Users who are interested in MAI_ICLR2025 are comparing it to the repositories listed below:
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆29Updated 3 months ago
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆45Updated last month
- Official implementation of TagAlign☆35Updated 7 months ago
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval☆25Updated 3 months ago
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection".☆41Updated 10 months ago
- ☆32Updated last year
- ☆23Updated 2 years ago
- ☆19Updated 8 months ago
- Official repository of "Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach" (ACL 2024 Oral)☆29Updated 3 months ago
- [CBMI2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".☆27Updated 2 months ago
- ICLR'24 Official Implementation of Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization☆72Updated last year
- The official implementation for BLIP4CIR with bi-directional training | Bi-directional Training for Composed Image Retrieval via Text Pro…☆31Updated last year
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)☆69Updated last year
- Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024"☆24Updated 5 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆59Updated last year
- Code for ECCV 2022 Workshop paper "See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval"☆22Updated 2 weeks ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"☆67Updated 9 months ago
- Learnable Pillar-based Re-ranking for Image-Text Retrieval. SIGIR'23☆20Updated last year
- [ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding☆60Updated 8 months ago
- ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model☆15Updated last month
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆56Updated 8 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆41Updated 7 months ago
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"☆69Updated last year
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception☆65Updated last month
- Implementation of our paper, "Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination".☆19Updated last year
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆20Updated 8 months ago
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.☆20Updated 4 months ago
- [CVPR 2025] Official PyTorch Code for "MMRL: Multi-Modal Representation Learning for Vision-Language Models" and its extension "MMRL++: P…☆57Updated 3 weeks ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆39Updated 4 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆29Updated last year