FineCLIP: Self-distilled Region-based CLIP for Better Fine-grained Understanding (NIPS24)
☆34Nov 12, 2025Updated 3 months ago
Alternatives and similar repositories for FineCLIP
Users who are interested in FineCLIP are comparing it to the repositories listed below.
- ☆10Jul 5, 2024Updated last year
- [ACL Main 2025] I0T: Embedding Standardization Method Towards Zero Modality Gap☆12Jun 18, 2025Updated 8 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding☆55Apr 7, 2025Updated 10 months ago
- We present **FOCI**, a benchmark for Fine-grained Object ClassIfication for large vision language models (LVLMs).☆19Jun 21, 2024Updated last year
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆21Oct 8, 2024Updated last year
- [NeurIPS24] VisMin: Visual Minimal-Change Understanding☆19Mar 3, 2025Updated 11 months ago
- ☆14Oct 31, 2022Updated 3 years ago
- An official implementation of "GOAL⚽: Global-local Object Alignment Learning" (CVPR 2025).☆26Aug 14, 2025Updated 6 months ago
- ☆17Nov 15, 2022Updated 3 years ago
- Up-to-date collection of Vision Language Models, mainly focused on computer vision☆19Feb 9, 2023Updated 3 years ago
- PyTorch implementation for "Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning" (ICML 2024)☆24May 11, 2025Updated 9 months ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆55Aug 16, 2024Updated last year
- ☆54Jan 17, 2025Updated last year
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆28Aug 15, 2025Updated 6 months ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature.☆34Feb 13, 2025Updated last year
- Some papers about *diverse* image (a few videos) captioning☆26Apr 4, 2023Updated 2 years ago
- Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)☆28Jun 21, 2024Updated last year
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆37Aug 18, 2024Updated last year
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆27Nov 29, 2023Updated 2 years ago
- ☆29Jun 10, 2024Updated last year
- ☆36Nov 4, 2022Updated 3 years ago
- [ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning☆32Sep 30, 2024Updated last year
- Evaluation of semi-supervised learning on challenging datasets☆38Dec 21, 2021Updated 4 years ago
- NegCLIP.☆39Feb 6, 2023Updated 3 years ago
- Data repository for the VALSE benchmark.☆37Feb 15, 2024Updated 2 years ago
- [NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models.☆13Jun 18, 2025Updated 8 months ago
- Source code for the paper "Memory-Efficient Fine-Tuning via Low-Rank Activation Compression"☆13Aug 1, 2025Updated 6 months ago
- [NeurIPS 2023] Generalized Logit Adjustment☆39Apr 21, 2024Updated last year
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models☆48Sep 25, 2023Updated 2 years ago
- ☆10Apr 7, 2025Updated 10 months ago
- ☆13Dec 2, 2024Updated last year
- Official implementation of `Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning`, CVPR 2025☆13Aug 1, 2025Updated 6 months ago
- [CVPR2025] Official code for Lost in Translation Found in Context☆23Jan 14, 2026Updated last month
- Improving Continuous Sign Language Recognition with Adapted Image Models☆14Nov 10, 2025Updated 3 months ago
- The Chongqing University Bituminous Pavement Disease Detection Dataset (CQU-BPDD)☆13Apr 17, 2025Updated 10 months ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model☆13Feb 15, 2024Updated 2 years ago
- Repository for the paper 'Medical diffusion on a budget: textual inversion for medical image generation'☆12Dec 11, 2024Updated last year
- PyTorch tips and tutorials☆11Apr 17, 2023Updated 2 years ago
- [AAAI'25 Oral] NightReID: A Large-Scale Nighttime Person Re-Identification Benchmark☆10Jun 10, 2025Updated 8 months ago