Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
☆35Dec 5, 2022Updated 3 years ago
Alternatives and similar repositories for CPL
Users that are interested in CPL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago
- PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"☆50Aug 27, 2021Updated 4 years ago
- The good practice in the VQA system such as pos-tag attention, structed triplet learning and triplet attention is very general and can be…☆19Jan 23, 2018Updated 8 years ago
- Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"☆27Mar 4, 2021Updated 5 years ago
- Official implementation of ICML 2025 paper "Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach"☆12May 27, 2025Updated 11 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [ICLR2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models☆176Dec 14, 2023Updated 2 years ago
- Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…☆21Oct 20, 2020Updated 5 years ago
- Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.☆17Aug 30, 2022Updated 3 years ago
- ☆61May 2, 2025Updated last year
- This repo contains all the codes for SEScore implementation☆15Mar 3, 2025Updated last year
- ☆15May 10, 2021Updated 5 years ago
- A pytorch implemetation of data augmentation method for visual question answering☆21May 25, 2023Updated 2 years ago
- ☆12May 26, 2022Updated 3 years ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"☆29Apr 27, 2024Updated 2 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- ☆17Mar 3, 2025Updated last year
- Re-implementation for 'R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering'.☆12Mar 13, 2026Updated 2 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation☆35Jun 30, 2025Updated 10 months ago
- Official Github repo of the VIST Challenge NAACL 2018☆17Aug 3, 2018Updated 7 years ago
- Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation☆16Feb 7, 2022Updated 4 years ago
- Official code and dataset link for ''VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles''☆36Jul 30, 2021Updated 4 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"☆28Jul 15, 2025Updated 10 months ago
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022)☆30Aug 2, 2022Updated 3 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- ☆27Jun 22, 2024Updated last year
- vqa drived by bottom-up and top-down attention and knowledge☆14Nov 21, 2018Updated 7 years ago
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)☆23Jun 26, 2020Updated 5 years ago
- ☆22Jan 14, 2026Updated 4 months ago
- Video captioning baseline models on Video2Commonsense Dataset.☆56Apr 15, 2021Updated 5 years ago
- [ACMMM 2022] Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER☆18Feb 21, 2023Updated 3 years ago
- Counterfactual Samples Synthesizing for Robust VQA