ytaek-oh / awesome-vl-compositionality
Awesome Vision-Language Compositionality, a comprehensive curation of research papers in the literature.
☆24 Updated 4 months ago
Alternatives and similar repositories for awesome-vl-compositionality
Users interested in awesome-vl-compositionality are comparing it to the repositories listed below.
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆50 Updated 2 months ago
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models ☆30 Updated 4 months ago
- ☆21 Updated last year
- [CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images. ☆28 Updated last year
- [ICML 2024] Repo for the paper "Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models" ☆21 Updated 5 months ago
- [NeurIPS 2024] Code for Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models ☆41 Updated 3 months ago
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models" ☆15 Updated last year
- [NeurIPS 2024] Frustratingly easy Test-Time Adaptation of VLMs!! ☆47 Updated 3 months ago
- ☆12 Updated 6 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding ☆43 Updated 5 months ago
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆35 Updated 11 months ago
- Code for "BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation [ICML2024]".☆42Updated last year
- [NeurIPS 2024] VisMin: Visual Minimal-Change Understanding ☆15 Updated 3 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆34 Updated last year
- [ICCV 2023] Distribution-Aware Prompt Tuning for Vision-Language Models ☆40 Updated last year
- Official PyTorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…" ☆12 Updated 6 months ago
- VisualGPTScore for visio-linguistic reasoning ☆27 Updated last year
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning ☆58 Updated 4 months ago
- Official implementation of CVPR 2024 paper "Prompt Learning via Meta-Regularization". ☆27 Updated 3 months ago
- [ICLR 2025] Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆47 Updated 2 months ago
- [CVPR 2024] HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data ☆45 Updated 11 months ago
- 🔎 Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation". ☆37 Updated 3 months ago
- This repository houses the code for the paper "The Neglected Tails of VLMs" ☆28 Updated last month
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality ☆16 Updated 8 months ago
- Code for "CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally" ☆13 Updated 4 months ago
- ☆24 Updated 3 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" ☆41 Updated 6 months ago
- [CVPR 2025] Hyperbolic Safety-Aware Vision-Language Models ☆17 Updated 2 months ago
- This is the official repository for the paper "Cross-modal Information Flow in Multimodal Large Language Models" ☆13 Updated last month
- [CVPR 2023] Learning Attention as Disentangler for Compositional Zero-shot Learning ☆39 Updated last year