zhaoshitian / Causal-CoG
[CVPR'24 Highlight] Implementation of "Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models"
☆11 Updated 2 months ago
Related projects
Alternatives and complementary repositories for Causal-CoG
- Visual self-questioning for large vision-language assistant. ☆32 Updated last month
- [Paper][AAAI2024] Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations ☆114 Updated 5 months ago
- [CVPR 2024] FairCLIP: Harnessing Fairness in Vision-Language Learning ☆51 Updated 3 months ago
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU ☆41 Updated last year
- Code for the paper "RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning" (EMNLP'23 Findings). ☆23 Updated 6 months ago
- Code for the paper "ORGAN: Observation-Guided Radiology Report Generation via Tree Reasoning" (ACL'23). ☆49 Updated last month
- OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding ☆29 Updated last week
- Official code repository for "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning" (published at ICLR 202… ☆55 Updated last year
- [CVPR 2024] Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation ☆17 Updated last month
- ViLLA: Fine-grained vision-language representation learning from real-world data ☆40 Updated last year
- [arXiv'23] HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding ☆33 Updated 3 months ago
- [ACMMM-2022] This is the official implementation of Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Know… ☆36 Updated last year
- The official start-up code for paper "FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark." ☆53 Updated 2 years ago
- ☆22 Updated 6 months ago
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning ☆44 Updated 6 months ago
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024) ☆67 Updated 9 months ago
- S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions ☆45 Updated last year
- Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024) ☆51 Updated last month
- [EMNLP'24] Code and data for paper "Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models" ☆63 Updated last month
- [ECCV2022] The official implementation of Cross-modal Prototype Driven Network for Radiology Report Generation ☆66 Updated 10 months ago
- [ICCV-2023] Towards Unifying Medical Vision-and-Language Pre-training via Soft Prompts ☆63 Updated 8 months ago
- Radiology Report Generation with Frozen LLMs ☆53 Updated 7 months ago
- Multi-Aspect Vision Language Pretraining - CVPR2024 ☆64 Updated 3 months ago
- Awesome List of Vision Language Prompt Papers ☆37 Updated last year
- [NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models ☆34 Updated 8 months ago
- The official GitHub repository of the AAAI-2024 paper "Bootstrapping Large Language Models for Radiology Report Generation". ☆42 Updated 6 months ago
- [CVPRW 2024] LaPA: Latent Prompt Assist Model For Medical Visual Question Answering ☆16 Updated 4 months ago
- ☆34 Updated 2 years ago
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models ☆146 Updated 11 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆28 Updated 8 months ago