CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22Apr 15, 2022Updated 3 years ago
Alternatives and similar repositories for UVLP
Users that are interested in UVLP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- A pytorch implemetation of data augmentation method for visual question answering☆21May 25, 2023Updated 2 years ago
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training☆34Nov 9, 2021Updated 4 years ago
- ☆12Mar 8, 2021Updated 5 years ago
- pytorch implementation of mvp: a multi-stage vision-language pre-training framework☆11Apr 23, 2022Updated 3 years ago
- ☆18Jun 10, 2024Updated last year
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Implementation for CVPR 2020 Paper "Two Causal Principles for Improving Visual Dialog"☆31Feb 19, 2023Updated 3 years ago
- ☆17Mar 13, 2023Updated 3 years ago
- Data of ACL 2019 Paper "Expressing Visual Relationships via Language".☆63Sep 30, 2020Updated 5 years ago
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Nov 30, 2022Updated 3 years ago
- "Describing Textures using Natural Language" code and data, ECCV 2020 Oral.☆17Aug 6, 2020Updated 5 years ago
- Starter code for the VMT task and challenge☆51Jul 29, 2020Updated 5 years ago
- Feature resources of "Diagnosing the Environment Bias in Vision-and-Language Navigation"☆16May 6, 2020Updated 5 years ago
- support Large Vocabulary Instance Segmentation (LVIS) dataset for mmdetection☆16Apr 24, 2020Updated 5 years ago
- SelfCriticalSequenceTrainingforImageCaptioning☆21May 27, 2017Updated 8 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- ☆24Apr 4, 2022Updated 3 years ago
- [EMNLP2022] Transformer-based Entity Typing in Knowledge Graphs☆16Nov 26, 2024Updated last year
- Visual Relation Grounding in Videos (ECCV'20, Spotlight)☆57Dec 8, 2022Updated 3 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- Pre-trained V+L Data Preparation☆46Jun 2, 2020Updated 5 years ago
- A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)☆149Nov 18, 2020Updated 5 years ago
- A video retrieval dataset How2R and a video QA dataset How2QA☆24Oct 15, 2020Updated 5 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020☆43Apr 26, 2020Updated 5 years ago
- source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT☆72Nov 14, 2022Updated 3 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Counterfactual Samples Synthesizing for Robust VQA☆79Nov 24, 2022Updated 3 years ago
- my Ph.D. thesis (Zhejiang University)☆38Apr 9, 2022Updated 3 years ago
- ☆44Aug 2, 2021Updated 4 years ago
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Dec 12, 2023Updated 2 years ago
- Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"☆15Jul 22, 2025Updated 8 months ago
- Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P…☆59Mar 24, 2023Updated 3 years ago
- Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering☆29Jul 1, 2024Updated last year
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"☆69Oct 11, 2021Updated 4 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction☆51Aug 20, 2022Updated 3 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Code for DVD A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- [ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning framework.☆110Dec 8, 2023Updated 2 years ago
- Referring Expression Parser☆27Feb 10, 2018Updated 8 years ago
- Visual Coreference Resolution in Visual Dialog using Neural Module Networks☆57Oct 12, 2021Updated 4 years ago
- [ECCV 2022] FashionViL: Fashion-Focused V+L Representation Learning☆61Nov 15, 2022Updated 3 years ago
- ☆12Dec 20, 2024Updated last year
- Official code for our paper "Model Composition for Multimodal Large Language Models" (ACL 2024)☆31Jan 8, 2025Updated last year