jayleicn / recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
☆166 Updated 3 years ago
Related projects
Alternatives and complementary repositories for recurrent-transformer
- [ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering ☆123 Updated 2 years ago
- Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021 ☆64 Updated 3 years ago
- Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding* ☆29 Updated 3 years ago
- Starter Code for VALUE benchmark ☆79 Updated 2 years ago
- Video captioning baseline models on Video2Commonsense Dataset. ☆57 Updated 3 years ago
- [ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval ☆153 Updated 5 months ago
- Second-place solution to dense video captioning task in ActivityNet Challenge (CVPR 2020 workshop) ☆73 Updated 3 years ago
- [CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606) ☆67 Updated 4 years ago
- Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference" ☆159 Updated 4 years ago
- Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training" ☆230 Updated 3 years ago
- Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised man… ☆104 Updated 4 years ago
- A curated list of research papers in Video Captioning ☆118 Updated 3 years ago
- [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990) ☆57 Updated 3 years ago
- A length-controllable and non-autoregressive image captioning model. ☆66 Updated 3 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆88 Updated last year
- IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning ☆79 Updated 3 years ago
- Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos ☆68 Updated 3 years ago
- CLIP-It! Language-Guided Video Summarization ☆73 Updated 3 years ago
- Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding" ☆130 Updated 3 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops) ☆143 Updated last year
- Official pytorch implementation of the AAAI 2021 paper "Semantic Grouping Network for Video Captioning" ☆52 Updated 3 years ago
- S3D Text-Video model trained on HowTo100M using MIL-NCE ☆191 Updated 4 years ago
- Implementation for MAF: Multimodal Alignment Framework ☆43 Updated 3 years ago
- [CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning ☆91 Updated 7 months ago
- A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning. ☆38 Updated 2 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction ☆48 Updated 2 years ago
- A PyTorch implementation of VIOLET