google / tirg
deep learning, image retrieval, vision and language
☆303 Updated 4 years ago
Alternatives and similar repositories for tirg
Users who are interested in tirg are comparing it to the libraries listed below.
- CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval ☆127 Updated 5 years ago
- Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning". ☆211 Updated 4 years ago
- Multi-Modal Transformer for Video Retrieval ☆259 Updated 7 months ago
- PyTorch code for ICCV'19 paper "Visual Semantic Reasoning for Image-Text Matching" ☆301 Updated 5 years ago
- Video embeddings for retrieval with natural language queries ☆341 Updated 2 years ago
- A PyTorch reimplementation of bottom-up-attention models ☆301 Updated 3 years ago
- PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" ☆507 Updated 3 years ago
- This repository focuses on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP ☆413 Updated 2 years ago
- Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019) ☆134 Updated last year
- Vision-Language Pre-training for Image Captioning and Question Answering ☆418 Updated 3 years ago
- ☆154 Updated 3 years ago
- Code for Unsupervised Image Captioning ☆217 Updated 2 years ago
- This repository contains an implementation of the models introduced in the paper Dialog-based Interactive Image Retrieval. The network is… ☆69 Updated 4 years ago
- Faster RCNN model in Pytorch version, pretrained on the Visual Genome with ResNet 101 ☆237 Updated 2 years ago
- Implementation of our CVPR2020 paper, Graph Structured Network for Image-Text Matching ☆167 Updated 4 years ago
- [CVPR2019] Dual Encoding for Zero-Example Video Retrieval ☆153 Updated 2 years ago
- PyTorch bottom-up attention with Detectron2 ☆233 Updated 3 years ago
- code for our CVPR2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval" ☆94 Updated 5 years ago
- Grid features pre-training code for visual question answering ☆269 Updated 3 years ago
- project page for VinVL ☆355 Updated last year
- Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning ☆84 Updated 4 years ago
- PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018) ☆563 Updated 2 years ago
- Mixture-of-Embeddings-Experts ☆119 Updated 4 years ago
- Multi Task Vision and Language ☆811 Updated 3 years ago
- MUREL (CVPR 2019), a multimodal relational reasoning module for VQA ☆195 Updated 5 years ago
- Automatic image captioning model based on Caffe, using features from bottom-up attention. ☆245 Updated 2 years ago
- Efficient Diffusion for Image Retrieval ☆222 Updated 5 years ago
- Video Grounding and Captioning ☆326 Updated 3 years ago
- A lightweight, scalable, and general framework for visual question answering research ☆323 Updated 3 years ago
- PyTorch implementation of Image captioning with Bottom-up, Top-down Attention ☆166 Updated 6 years ago