Caffe implementation of the paper "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering"
☆29Oct 24, 2018Updated 7 years ago
Alternatives and similar repositories for up-down-captioner
Users who are interested in up-down-captioner are comparing it to the libraries listed below
- VQA driven by bottom-up and top-down attention and knowledge☆14Nov 21, 2018Updated 7 years ago
- ☆14Jan 30, 2017Updated 9 years ago
- PyTorch implementation of Videos as Space-Time Region Graphs☆27May 30, 2025Updated 9 months ago
- Deliberate Attention Networks for Image Captioning (AAAI 2019)☆11Sep 30, 2019Updated 6 years ago
- ☆16Dec 17, 2018Updated 7 years ago
- Study of frame rate effects on MSR-VTT dataset☆14Feb 10, 2018Updated 8 years ago
- Code for ECCV 2020 paper "Hierarchical Visual-Textual Graph for Temporal Activity Localization via Language"☆17Aug 25, 2020Updated 5 years ago
- ☆15Jul 23, 2019Updated 6 years ago
- Extension of hLSTMat☆19Apr 15, 2021Updated 4 years ago
- ☆20Sep 19, 2019Updated 6 years ago
- The paper of "Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning" accepted in International Joint Conference on Arti…☆16Jun 29, 2017Updated 8 years ago
- The implementation of Sequential VLAD in PyTorch☆20Jun 20, 2019Updated 6 years ago
- Towards Diverse and Natural Image Descriptions via a Conditional GAN☆75Dec 2, 2017Updated 8 years ago
- Contrastive Learning for Image Captioning☆51Feb 22, 2018Updated 8 years ago
- Code for the paper "Attention-based LSTM with Semantic Consistency for Videos Captioning"☆18Mar 22, 2017Updated 8 years ago
- An image captioning model that is inspired by the Show, Attend and Tell paper (https://arxiv.org/abs/1502.03044) and the Sequence Generat…☆22Sep 4, 2020Updated 5 years ago
- Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome☆1,466Feb 3, 2023Updated 3 years ago
- A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning☆25Sep 4, 2020Updated 5 years ago
- Code for Discriminability objective for training descriptive captions (CVPR 2018)☆109Nov 21, 2019Updated 6 years ago
- An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.☆765Mar 10, 2024Updated last year
- Source code for the paper "Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training"☆66Apr 18, 2019Updated 6 years ago
- Video Captioning on MSR-VTT and MSVD dataset using Deep Learning☆21Aug 14, 2020Updated 5 years ago
- Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"☆23Mar 4, 2020Updated 6 years ago
- PyTorch implementation of audio-visual fusion video captioning model☆27Jul 26, 2018Updated 7 years ago
- Evaluation code for Dense-Captioning Events in Videos☆130Jun 11, 2019Updated 6 years ago
- Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)☆37May 16, 2022Updated 3 years ago
- Code for reproducing the results in "Learning to Detect Human-Object Interactions"☆65Jun 10, 2024Updated last year
- Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)☆134Jul 25, 2024Updated last year
- Co-attending Regions and Detections for VQA.☆40Jun 2, 2018Updated 7 years ago
- Implementation of video captioning based on OpenNMT☆36Apr 19, 2018Updated 7 years ago
- ☆33Apr 20, 2018Updated 7 years ago
- ACM ICMR 2019: "Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention"☆36Jun 19, 2019Updated 6 years ago
- Human-like Controllable Image Captioning with Verb-specific Semantic Roles.☆36Mar 11, 2022Updated 3 years ago
- [NeurIPS'25 Spotlight] This is the official codebase for the paper: STAR: A Benchmark for Astronomical Star Fields Super-Resolution☆15Oct 9, 2025Updated 4 months ago
- Statistical discontinuous constituent parsing☆11Feb 15, 2018Updated 8 years ago
- Chinese word segmentation with a neural seq2seq model, implemented in PyTorch☆10Dec 13, 2017Updated 8 years ago
- ☆12May 25, 2023Updated 2 years ago
- TensorFlow implementation of "Dynamic Memory Networks for Visual and Textual Question Answering"☆79Mar 22, 2018Updated 7 years ago
- Official PyTorch implementation for the AAAI 2021 paper "RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning"☆37Nov 5, 2021Updated 4 years ago