rowanz/r2c

Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)

☆469

Alternatives and similar repositories for r2c

Users who are interested in r2c are comparing it to the repositories listed below.

Deanplayerljx / tab-vcr
PyTorch implementation for our NeurIPS 2019 paper "TAB-VCR: Tags and Attributes based VCR Baselines" https://arxiv.org/abs/1910.14671
☆19 · May 6, 2021 · Updated 5 years ago
AmingWu / CCN
Connective Cognition Network for Directional Visual Commonsense Reasoning
☆15 · May 6, 2021 · Updated 5 years ago
jiasenlu / vilbert_beta
☆478 · Nov 21, 2022 · Updated 3 years ago
rowanz / neural-motifs
Code for Neural Motifs: Scene Graph Parsing with Global Context (CVPR 2018)
☆545 · Aug 9, 2019 · Updated 6 years ago
jamespark3922 / visual-comet
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
☆87 · Jun 12, 2023 · Updated 3 years ago
hengyuan-hu / bottom-up-attention-vqa
An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
☆768 · Mar 10, 2024 · Updated 2 years ago
TheShadow29 / visual-commonsense-pytorch
For visual commonsense model
☆34 · Apr 12, 2019 · Updated 7 years ago
jayleicn / TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
☆181 · Oct 25, 2022 · Updated 3 years ago
airsplay / lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
☆966 · Oct 22, 2022 · Updated 3 years ago
stanfordnlp / mac-network
Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018)
☆511 · Jul 10, 2021 · Updated 5 years ago
rowanz / swagaf
Repository for paper "SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference"
☆179 · Aug 14, 2020 · Updated 5 years ago
facebookresearch / mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
☆5,634 · Jul 7, 2026 · Updated 2 weeks ago
yuweijiang / HGL-pytorch
Code for the model "Heterogeneous Graph Learning for Visual Commonsense Reasoning (NeurIPS 2019)"
☆47 · Jul 27, 2020 · Updated 5 years ago
jackroos / VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
☆743 · May 22, 2023 · Updated 3 years ago
jayleicn / TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
☆132 · Oct 25, 2022 · Updated 3 years ago
jwyang / graph-rcnn.pytorch
[ECCV 2018] Official code for "Graph R-CNN for Scene Graph Generation"
☆748 · Apr 1, 2020 · Updated 6 years ago
jayleicn / VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
☆52 · Aug 20, 2022 · Updated 3 years ago
peteanderson80 / bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
☆1,470 · Feb 3, 2023 · Updated 3 years ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆226 · Mar 15, 2022 · Updated 4 years ago
jz462 / Large-Scale-VRD.pytorch
Implementation for the AAAI 2019 paper "Large-scale Visual Relationship Understanding"
☆144 · Sep 3, 2019 · Updated 6 years ago
Cadene / murel.bootstrap.pytorch
MUREL (CVPR 2019), a multimodal relational reasoning module for VQA
☆194 · Feb 9, 2020 · Updated 6 years ago
PKU-ICST-MIPL / CKRM_TCSVT2020
Source code of our TCSVT 2020 paper "Multi-level Knowledge Injecting for Visual Commonsense Reasoning"
☆11 · Sep 18, 2024 · Updated last year
lichengunc / MAttNet
MAttNet: Modular Attention Network for Referring Expression Comprehension
☆299 · Nov 29, 2022 · Updated 3 years ago
necla-ml / SNLI-VE
Dataset and starting code for visual entailment dataset
☆123 · Apr 21, 2022 · Updated 4 years ago
jokieleung / awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Common…
☆672 · Jul 6, 2023 · Updated 3 years ago
yikang-li / MSDN
This is our PyTorch implementation of Multi-level Scene Description Network (MSDN) proposed in our ICCV 2017 paper.
☆229 · Nov 19, 2019 · Updated 6 years ago
henryhungle / MTN
Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19)
☆100 · Oct 17, 2022 · Updated 3 years ago
facebookresearch / EmbodiedQA
Train embodied agents that can answer questions in environments
☆315 · Jul 25, 2023 · Updated 2 years ago
yuweihao / KERN
Code for Knowledge-Embedded Routing Network for Scene Graph Generation (CVPR 2019)
☆121 · Aug 17, 2022 · Updated 3 years ago
ChenRocks / UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
☆800 · Jun 30, 2021 · Updated 5 years ago
facebookresearch / ActivityNet-Entities
A Dataset for Grounded Video Description
☆165 · Jan 4, 2022 · Updated 4 years ago
google-research-datasets / conceptual-captions
Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image …
☆567 · Aug 21, 2021 · Updated 4 years ago
SinghJasdeep / Attention-on-Attention-for-VQA
Visual Question Answering project with state-of-the-art single-model performance.
☆130 · Jun 18, 2018 · Updated 8 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆161 · Apr 29, 2020 · Updated 6 years ago
allenai / visual-reasoning-rationalization
Code associated with the "Natural Language Rationales with Full-Stack Visual Reasoning" EMNLP Findings 2020 paper
☆24 · Jan 15, 2021 · Updated 5 years ago
JunweiLiang / FVTA_MemexQA
Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19
☆33 · Jul 1, 2019 · Updated 7 years ago
facebookresearch / clevr-iep
Inferring and Executing Programs for Visual Reasoning
☆805 · Aug 30, 2021 · Updated 4 years ago
aimbrain / vqa-project
Code for our paper: Learning Conditioned Graph Structures for Interpretable Visual Question Answering
☆150 · Mar 11, 2019 · Updated 7 years ago
ronghanghu / snmn
Code release for Hu et al., "Explainable Neural Computation via Stack Neural Module Networks", ECCV 2018
☆71 · Nov 17, 2019 · Updated 6 years ago