zmykevin/UVLP

CVPR 2022 (Oral) PyTorch code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment

☆21

Alternatives and similar repositories for UVLP

Users who are interested in UVLP are comparing it to the repositories listed below.

PluviophileYU / CVC-QA
Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering"
☆14 · Oct 13, 2020 · Updated 5 years ago
zaynmi / seada-vqa
A PyTorch implementation of a data augmentation method for visual question answering
☆21 · May 25, 2023 · Updated 3 years ago
zmykevin / UC2
CVPR 2021 Official PyTorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 · Nov 9, 2021 · Updated 4 years ago
FudanDISC / DISCOpen-MVPTR
PyTorch implementation of MVP: a multi-stage vision-language pre-training framework
☆11 · Apr 23, 2022 · Updated 4 years ago
simpleshinobu / visdial-principles
Implementation for CVPR 2020 Paper "Two Causal Principles for Improving Visual Dialog"
☆31 · Feb 19, 2023 · Updated 3 years ago
JoyHuYY1412 / Class_Imbalanced_Semi_Supervised_Learning
☆17 · Mar 13, 2023 · Updated 3 years ago
airsplay / VisualRelationships
Data of ACL 2019 Paper "Expressing Visual Relationships via Language".
☆63 · Sep 30, 2020 · Updated 5 years ago
baaaad / ECE
[ECCV'22 Poster] Explicit Image Caption Editing
☆22 · Nov 30, 2022 · Updated 3 years ago
ChenyunWu / DescribingTextures
"Describing Textures using Natural Language" code and data, ECCV 2020 Oral.
☆17 · Aug 6, 2020 · Updated 5 years ago
eric-xw / Video-guided-Machine-Translation
Starter code for the VMT task and challenge
☆51 · Jul 29, 2020 · Updated 5 years ago
zhangybzbo / EnvBiasVLN
Feature resources of "Diagnosing the Environment Bias in Vision-and-Language Navigation"
☆16 · May 6, 2020 · Updated 6 years ago
zhiweihu1103 / ET-TET
[EMNLP2022] Transformer-based Entity Typing in Knowledge Graphs
☆15 · Nov 26, 2024 · Updated last year
KaihuaTang / LVIS-for-mmdetection
Support for the Large Vocabulary Instance Segmentation (LVIS) dataset in mmdetection
☆16 · Apr 24, 2020 · Updated 6 years ago
Weili-NLP / SelfCriticalSequenceTraining-tensorflow
Self-Critical Sequence Training for Image Captioning
☆21 · May 27, 2017 · Updated 9 years ago
YuanEZhou / CBTrans
☆24 · Apr 4, 2022 · Updated 4 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26 · Oct 20, 2022 · Updated 3 years ago
zjuchenlong / Thesis-latex
My Ph.D. thesis (Zhejiang University)
☆38 · Apr 9, 2022 · Updated 4 years ago
doc-doc / vRGV
Visual Relation Grounding in Videos (ECCV'20, Spotlight)
☆57 · Dec 8, 2022 · Updated 3 years ago
lichengunc / pretrain-vl-data
Pre-trained V+L Data Preparation
☆47 · Jun 2, 2020 · Updated 6 years ago
zyang-ur / onestage_grounding
A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)
☆150 · Nov 18, 2020 · Updated 5 years ago
ych133 / How2R-and-How2QA
A video retrieval dataset How2R and a video QA dataset How2QA
☆24 · Oct 15, 2020 · Updated 5 years ago
yanxinzju / CSS-VQA
Counterfactual Samples Synthesizing for Robust VQA
☆78 · Nov 24, 2022 · Updated 3 years ago
gchhablani / multilingual-image-captioning
☆43 · Aug 2, 2021 · Updated 4 years ago
kodenii / ORES
ORES: Open-vocabulary Responsible Visual Synthesis
☆14 · Dec 12, 2023 · Updated 2 years ago
cdancette / detect-shortcuts
Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
☆29 · Jul 1, 2024 · Updated 2 years ago
jialinwu17 / self_critical_vqa
Code for NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
☆40 · Sep 9, 2019 · Updated 6 years ago
ronghanghu / lcgn
Code release for Hu et al., "Language-Conditioned Graph Networks for Relational Reasoning", in ICCV 2019
☆92 · Aug 9, 2019 · Updated 6 years ago
xuwangyin / pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
☆11 · Aug 5, 2017 · Updated 8 years ago
THUNLP-MT / ActiView
☆11 · Dec 20, 2024 · Updated last year
THUNLP-MT / ModelCompose
Official code for our paper "Model Composition for Multimodal Large Language Models" (ACL 2024)
☆31 · Jan 8, 2025 · Updated last year
jayleicn / VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
☆52 · Aug 20, 2022 · Updated 3 years ago
facebookresearch / DVDialogues
Code for DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
☆14 · Oct 12, 2021 · Updated 4 years ago
lichengunc / refer-parser2
Referring Expression Parser
☆27 · Feb 10, 2018 · Updated 8 years ago
facebookresearch / corefnmn
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
☆58 · Oct 12, 2021 · Updated 4 years ago
BrandonHanx / mmf
[ECCV 2022] FashionViL: Fashion-Focused V+L Representation Learning
☆61 · Nov 15, 2022 · Updated 3 years ago
mad-red / VSR-guided-CIC
Human-like Controllable Image Captioning with Verb-specific Semantic Roles.
☆36 · Mar 11, 2022 · Updated 4 years ago
medhini / Instructional-Video-Summarization
Code for the paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency", ECCV 2022
☆39 · Feb 17, 2023 · Updated 3 years ago
UKPLab / emnlp2020-debiasing-unknown
☆26 · Apr 15, 2021 · Updated 5 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 · May 5, 2023 · Updated 3 years ago