zmykevin / UVLP
CVPR 2022 (Oral) PyTorch code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22 · Apr 15, 2022 · Updated 3 years ago
Alternatives and similar repositories for UVLP
Users who are interested in UVLP are comparing it to the repositories listed below
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering" ☆14 · Oct 13, 2020 · Updated 5 years ago
- A PyTorch implementation of a data augmentation method for visual question answering ☆21 · May 25, 2023 · Updated 2 years ago
- "Describing Textures using Natural Language" code and data, ECCV 2020 Oral. ☆17 · Aug 6, 2020 · Updated 5 years ago
- CVPR 2021 Official PyTorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training ☆34 · Nov 9, 2021 · Updated 4 years ago
- ☆24 · Apr 4, 2022 · Updated 3 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching" ☆26 · Oct 20, 2022 · Updated 3 years ago
- Code for DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue ☆14 · Oct 12, 2021 · Updated 4 years ago
- A complete end-to-end Deep Learning system to generate high quality human like speech in English for Korean Drama (WIP) ☆13 · Sep 17, 2022 · Updated 3 years ago
- ☆12 · Mar 8, 2021 · Updated 4 years ago
- PyTorch implementation of MVP: a multi-stage vision-language pre-training framework ☆11 · Apr 23, 2022 · Updated 3 years ago
- ORES: Open-vocabulary Responsible Visual Synthesis ☆14 · Dec 12, 2023 · Updated 2 years ago
- Feature resources of "Diagnosing the Environment Bias in Vision-and-Language Navigation" ☆16 · May 6, 2020 · Updated 5 years ago
- Counterfactual Samples Synthesizing for Robust VQA ☆79 · Nov 24, 2022 · Updated 3 years ago
- Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P… ☆59 · Mar 24, 2023 · Updated 2 years ago
- Data Release for VALUE Benchmark ☆30 · Feb 16, 2022 · Updated 3 years ago
- Starter code for the VMT task and challenge ☆51 · Jul 29, 2020 · Updated 5 years ago
- [EMNLP2022] Transformer-based Entity Typing in Knowledge Graphs ☆16 · Nov 26, 2024 · Updated last year
- This repository provides the dataset introduced by our WSSTG paper ☆13 · Jul 21, 2019 · Updated 6 years ago
- An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019) ☆37 · May 3, 2020 · Updated 5 years ago
- Visual Relation Grounding in Videos (ECCV'20, Spotlight) ☆57 · Dec 8, 2022 · Updated 3 years ago
- Data of ACL 2019 Paper "Expressing Visual Relationships via Language". ☆62 · Sep 30, 2020 · Updated 5 years ago
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021 ☆18 · Oct 24, 2021 · Updated 4 years ago
- Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T… ☆34 · May 14, 2020 · Updated 5 years ago
- Code release for Hu et al., Language-Conditioned Graph Networks for Relational Reasoning, ICCV 2019 ☆92 · Aug 9, 2019 · Updated 6 years ago
- A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral) ☆149 · Nov 18, 2020 · Updated 5 years ago
- Implementation for CVPR 2020 Paper "Two Causal Principles for Improving Visual Dialog" ☆31 · Feb 19, 2023 · Updated 2 years ago
- Code for paper, "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" ECCV 2022 ☆39 · Feb 17, 2023 · Updated 2 years ago
- Repository of our paper 'Refer-it-in-RGBD' in CVPR 2021 ☆43 · May 24, 2024 · Updated last year
- Support Large Vocabulary Instance Segmentation (LVIS) dataset for mmdetection ☆16 · Apr 24, 2020 · Updated 5 years ago
- ☆18 · Jun 10, 2024 · Updated last year
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles. ☆36 · Mar 11, 2022 · Updated 3 years ago
- My Ph.D. thesis (Zhejiang University) ☆38 · Apr 9, 2022 · Updated 3 years ago
- MAC: Mining Activity Concepts for Language-based Temporal Localization ☆36 · Nov 26, 2018 · Updated 7 years ago
- Code for ECCV 2020 paper "Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language" ☆17 · Aug 25, 2020 · Updated 5 years ago
- Self-Critical Sequence Training for Image Captioning ☆21 · May 27, 2017 · Updated 8 years ago
- ☆44 · Aug 2, 2021 · Updated 4 years ago
- Source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT ☆72 · Nov 14, 2022 · Updated 3 years ago
- Multi-faceted Video Moment Localizer ☆17 · Jun 19, 2020 · Updated 5 years ago
- ☆17 · Mar 13, 2023 · Updated 2 years ago