ppapalampidi / GraphTP
Source code for the AAAI 2021 paper "Movie Summarization via Sparse Graph Construction"
☆30Updated 3 years ago
Related projects:
- TuRnIng POint Dataset☆46Updated 4 years ago
- Screenplay Summarization using Latent Narrative Structure☆35Updated 2 years ago
- The code and output of our AAAI paper "Knowledge-Enriched Visual Storytelling"☆40Updated 3 years ago
- GitHub repository for Plot and Rework: Modeling Storylines for Visual Storytelling (ACL-IJCNLP2021 Findings)☆20Updated 2 years ago
- Learning Interactions and Relationships between Movie Characters (CVPR'20)☆21Updated last year
- Official code and dataset link for "VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles"☆33Updated 3 years ago
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)☆27Updated 2 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".☆54Updated 2 years ago
- Humor Knowledge Enriched Transformer☆28Updated 2 years ago
- Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.☆50Updated last year
- [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)☆57Updated 3 years ago
- This repository aims to collect the articles and codes for the Visual Storytelling (VIST) task. VIST is a vision-and-language task. It ai…☆17Updated 3 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆49Updated 2 years ago
- A collection of models for image<->text generation in ACM MM 2021.☆64Updated 2 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction☆47Updated 2 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"☆136Updated 2 years ago
- Code for paper, "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" ECCV 2022☆34Updated last year
- Re-implementation of the work Livebot☆16Updated 4 years ago
- Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"☆158Updated 4 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆26Updated 2 years ago
- Visual Storytelling with Cross-Modal Rules☆7Updated 4 years ago
- VisualCOMET: Reasoning about the Dynamic Context of a Still Image☆85Updated last year
- Use CLIP to represent video for Retrieval Task☆67Updated 3 years ago
- PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)☆56Updated last year
- PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"☆50Updated 3 years ago
- Hierarchical Question-Image Co-Attention for Visual Question Answering☆22Updated 5 years ago