Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
☆35Dec 5, 2022Updated 3 years ago
Alternatives and similar repositories for CPL
Users that are interested in CPL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago
- PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"☆50Aug 27, 2021Updated 4 years ago
- The good practice in the VQA system such as pos-tag attention, structed triplet learning and triplet attention is very general and can be…☆19Jan 23, 2018Updated 8 years ago
- Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"☆27Mar 4, 2021Updated 5 years ago
- Official implementation of ICML 2025 paper "Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach"☆12May 27, 2025Updated 11 months ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…☆21Oct 20, 2020Updated 5 years ago
- Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.☆17Aug 30, 2022Updated 3 years ago
- This repo contains all the codes for SEScore implementation☆15Mar 3, 2025Updated last year
- ☆15May 10, 2021Updated 5 years ago
- ☆12May 26, 2022Updated 3 years ago
- ☆17Mar 3, 2025Updated last year
- Project for Dynamic Capsule Attention☆12Dec 7, 2019Updated 6 years ago
- Re-implementation for 'R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering'.☆12Mar 13, 2026Updated 2 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation☆35Jun 30, 2025Updated 10 months ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Official Github repo of the VIST Challenge NAACL 2018☆17Aug 3, 2018Updated 7 years ago
- Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation☆16Feb 7, 2022Updated 4 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"☆28Jul 15, 2025Updated 10 months ago
- DeVLBert: Learning Deconfounded Visio-Linguistic Representations☆27Nov 27, 2022Updated 3 years ago
- [ECCV 2022] Official pytorch implementation of the paper "FedVLN: Privacy-preserving Federated Vision-and-Language Navigation"☆13Oct 8, 2022Updated 3 years ago
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022)☆30Aug 2, 2022Updated 3 years ago
- ☆27Jun 22, 2024Updated last year
- vqa drived by bottom-up and top-down attention and knowledge☆14Nov 21, 2018Updated 7 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Shows visual grounding methods can be right for the wrong reasons! (ACL 2020)☆23Jun 26, 2020Updated 5 years ago
- ☆22Jan 14, 2026Updated 4 months ago
- Video captioning baseline models on Video2Commonsense Dataset.☆56Apr 15, 2021Updated 5 years ago
- [ACMMM 2022] Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER☆18Feb 21, 2023Updated 3 years ago
- Unofficial reimplementation of Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering☆18Oct 30, 2019Updated 6 years ago
- [NeurIPS'24] I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing☆33Dec 9, 2025Updated 5 months ago
- Code release for Fried et al., Speaker-Follower Models for Vision-and-Language Navigation. in NeurIPS, 2018.☆137Nov 22, 2022Updated 3 years ago
- Official code for the ACL 2021 Findings paper "Yichi Zhang and Joyce Chai. Hierarchical Task Learning from Language Instructions with Uni…☆24Jun 28, 2021Updated 4 years ago
- ☆27Oct 7, 2021Updated 4 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- vist story telling evaluation tool☆21Dec 5, 2023Updated 2 years ago
- Repository of paper Consistency-preserving Visual Question Answering in Medical Imaging (MICCAI2022)☆26Mar 28, 2023Updated 3 years ago
- Controllable mage captioning model with unsupervised modes☆21Apr 14, 2023Updated 3 years ago
- Pytorch implementation for our NeurIPS 2019 paper "TAB-VCR: Tags and Attributes based VCR Baselines" https://arxiv.org/abs/1910.14671☆19May 6, 2021Updated 5 years ago
- code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720☆57Jun 5, 2024Updated last year
- Code for NeurIPS 2019 paper ``Self-Critical Reasoning for Robust Visual Question Answering''☆41Sep 9, 2019Updated 6 years ago
- Recursive Visual Programming (ECCV 2024)☆18Nov 20, 2024Updated last year