naver-ai / clip4dm
Official PyTorch implementation of Extract Free Dense Misalignment from CLIP (AAAI'25)
☆21 Updated last month
Alternatives and similar repositories for clip4dm:
Users who are interested in clip4dm compare it to the libraries listed below.
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆16 Updated 3 months ago
- [ECCV 2024] Official PyTorch implementation of "HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts" ☆16 Updated 4 months ago
- [ICLR 2023] RC-MAE ☆51 Updated last year
- ☆42 Updated 2 weeks ago
- Extended COCO Validation (ECCV) Caption dataset (ECCV 2022) ☆56 Updated last year
- ☆46 Updated 11 months ago
- [AAAI 2025] ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Mode… ☆15 Updated 6 months ago
- ☆38 Updated last year
- Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆57 Updated 10 months ago
- [ECCV 2024][ICCV 2023] Official PyTorch implementation of SeiT++ and SeiT ☆55 Updated 7 months ago
- ☆17 Updated 2 weeks ago
- Official implementation of TCL (CVPR 2023) ☆109 Updated last year
- ☆37 Updated 10 months ago
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners" ☆13 Updated 3 months ago
- [ICLR 2025] Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆35 Updated last month
- [AAAI 2024] BOK-VQA: Bilingual Outside Knowledge-based Visual Question Answering via Graph Representation Pretraining ☆1 Updated 9 months ago
- ☆26 Updated 4 months ago
- [2020.07-2021.07] Repository of outstanding code from the 14th cohort of 투빅스. ☆8 Updated 4 years ago
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning ☆48 Updated last month
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature. ☆19 Updated last month
- Official repository for the ICCV 2023 paper: "Waffling around for Performance: Visual Classification with Random Words and Broad Concepts… ☆56 Updated last year
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆32 Updated 6 months ago
- ☆24 Updated last year
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models" ☆27 Updated last year
- Official code of "Generating Instance-level Prompts for Rehearsal-free Continual Learning" (ICCV 2023) ☆42 Updated last year
- [ACL 2024 Findings] Official PyTorch Implementation code for realizing the technical part of CoLLaVO: Crayon Large Language and Vision mO… ☆95 Updated 9 months ago
- Official PyTorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆131 Updated 8 months ago
- [AAAI-24] VVS: Video-to-Video Retrieval With Irrelevant Frame Suppression ☆20 Updated 10 months ago
- ☆23 Updated last year
- [ECCV 2024] Official code for "Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation" ☆18 Updated 6 months ago