zhaoshitian / Causal-CoGLinks
[CVPR'24 Highlight] Implementation of "Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models"
☆15Updated last year
Alternatives and similar repositories for Causal-CoG
Users that are interested in Causal-CoG are comparing it to the libraries listed below
Sorting:
- [Paper][AAAI2024]Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations☆152Updated last year
- [CVPR 2024] FairCLIP: Harnessing Fairness in Vision-Language Learning☆97Updated 6 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)☆73Updated 11 months ago
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models"☆56Updated 5 months ago
- [COLING'25] HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding☆44Updated last year
- ☆49Updated 11 months ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆92Updated last year
- Pytorch Implementation for CVPR 2024 paper: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation☆56Updated 4 months ago
- ☆64Updated 3 months ago
- Code repository for "Post-pre-training for Modality Alignment in Vision-Language Foundation Models" (CVPR2025)☆37Updated 5 months ago
- ☆17Updated 2 years ago
- [NeurIPS 2023]DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models☆49Updated last year
- The official pytorch implemention of our CVPR-2024 paper "MMA: Multi-Modal Adapter for Vision-Language Models".☆95Updated 8 months ago
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training☆38Updated 9 months ago
- ☆25Updated 4 months ago
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"☆53Updated last year
- AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation☆49Updated last week
- [ECCV 2024] Official Implementation of "OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding"☆59Updated 6 months ago
- Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC…☆17Updated 11 months ago
- [CVPR 2024] The official pytorch implementation of "A General and Efficient Training for Transformer via Token Expansion".☆47Updated last year
- [ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without F…☆281Updated 2 years ago
- FineCLIP: Self-distilled Region-based CLIP for Better Fine-grained Understanding (NIPS24)☆33Updated 2 months ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV…☆78Updated 5 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Updated last year
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models☆174Updated 2 years ago
- [NeurIPS2024] Repo for the paper `ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models'☆204Updated 6 months ago
- [AAAI 2024] TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training☆105Updated 2 years ago
- [ICML'25] MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization☆66Updated 7 months ago
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization☆110Updated last year
- [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks☆27Updated 2 months ago