Official implementation of Attentive Mask CLIP (ICCV 2023, https://arxiv.org/abs/2212.08653)
☆35May 29, 2024Updated last year
Alternatives and similar repositories for A-CLIP
Users that are interested in A-CLIP are comparing it to the libraries listed below
- CLIP-MoE: Mixture of Experts for CLIP☆55Oct 10, 2024Updated last year
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep…☆12Jul 26, 2023Updated 2 years ago
- AlignCLIP: Improving Cross-Modal Alignment in CLIP (ICLR 2025)☆58Mar 1, 2025Updated 11 months ago
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments☆13Jul 8, 2024Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated last year
- ☆30Mar 2, 2023Updated 2 years ago
- [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"☆33Jan 26, 2026Updated last month
- ☆33Nov 4, 2024Updated last year
- A simple PyTorch implementation of a CLIP-based baseline for image-text matching.☆19May 25, 2023Updated 2 years ago
- [ICML'25] Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models☆21Sep 7, 2025Updated 5 months ago
- The official code for paper "Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection" (CVPR 2025)☆25Aug 15, 2025Updated 6 months ago
- Official implementation of TagAlign☆35Dec 11, 2024Updated last year
- ☆19Mar 24, 2025Updated 11 months ago
- Baseline and template code for node21 generation track☆11Feb 1, 2022Updated 4 years ago
- ☆16Sep 29, 2024Updated last year
- South China University of Technology undergraduate thesis template☆16May 29, 2023Updated 2 years ago
- ☆20Apr 23, 2024Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆46Dec 1, 2024Updated last year
- ☆19Jan 5, 2024Updated 2 years ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training☆227Mar 20, 2025Updated 11 months ago
- Official code for the paper "Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-…☆21May 11, 2025Updated 9 months ago
- ☆22Nov 4, 2024Updated last year
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data☆46Oct 15, 2023Updated 2 years ago
- Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning☆27Mar 23, 2025Updated 11 months ago
- ☆37Jan 12, 2026Updated last month
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆138May 8, 2025Updated 9 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language?☆20Mar 28, 2024Updated last year
- spatio-temporal tasks☆16Jul 15, 2024Updated last year
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation☆30Jun 12, 2025Updated 8 months ago
- Generative Multi-modal Models are Good Class Incremental Learners, CVPR 2024 [PyTorch Code]☆49Nov 21, 2024Updated last year
- Detail-Oriented CLIP for Fine-Grained Tasks (ICLR SSI-FM 2025)☆57Mar 26, 2025Updated 11 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding☆55Apr 7, 2025Updated 10 months ago
- ☆24Sep 25, 2024Updated last year
- RayGen: Multi-Modal Dataset Reinforcement for MobileCLIP and MobileCLIP2☆39Aug 29, 2025Updated 6 months ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆26Jan 14, 2025Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆32Mar 26, 2025Updated 11 months ago
- An official PyTorch implementation for CLIPPR☆30Jul 22, 2023Updated 2 years ago
- [CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".☆32May 12, 2025Updated 9 months ago
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆28Aug 15, 2025Updated 6 months ago