This repository collects articles and code for the Visual Storytelling (VIST) task. VIST is a vision-and-language task that aims to summarize the idea of a photo stream and tell a story about it in natural language. It should not be confused with "storytelling with data", which is more closely related to data visualization.
☆25 · Mar 3, 2021 · Updated 5 years ago
Alternatives and similar repositories for Collection-of-Visual-Storytelling-StoryNLP
Users who are interested in Collection-of-Visual-Storytelling-StoryNLP are comparing it to the repositories listed below
- Implementation of Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling ☆12 · Aug 19, 2021 · Updated 4 years ago
- PyTorch code for NODIS: Neural Ordinary Differential Scene Understanding, ECCV2020 ☆11 · Aug 28, 2020 · Updated 5 years ago
- GitHub repository for Plot and Rework: Modeling Storylines for Visual Storytelling (ACL-IJCNLP2021 Findings) ☆22 · Aug 22, 2022 · Updated 3 years ago
- ☆15 · Aug 20, 2024 · Updated last year
- The code and output of our AAAI paper "Knowledge-Enriched Visual Storytelling" ☆41 · May 3, 2021 · Updated 4 years ago
- VIST storytelling evaluation tool ☆21 · Dec 5, 2023 · Updated 2 years ago
- A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free… ☆29 · Jan 15, 2022 · Updated 4 years ago
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs… ☆32 · Mar 20, 2021 · Updated 4 years ago
- Matplotlib image labeller for classifying images ☆11 · Jan 5, 2026 · Updated 2 months ago
- Dedicated to Code in Place Spring 2021 with Stanford University or to those who are interested in learning Python for the first time ☆12 · May 16, 2025 · Updated 9 months ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog" ☆13 · May 12, 2023 · Updated 2 years ago
- Official implementation of CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation ☆12 · Dec 5, 2025 · Updated 3 months ago
- Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization ☆21 · Jan 27, 2026 · Updated last month
- ☆13 · Dec 1, 2025 · Updated 3 months ago
- Official PyTorch implementation for "Where You Edit is What You Get: Text-Guided Image Editing with Region-Based Attention" (Pattern Reco… ☆10 · Oct 1, 2024 · Updated last year
- Repository for "CoMix: Comprehensive Benchmark for Multi-Task Comic Understanding" ☆16 · Nov 20, 2024 · Updated last year
- Official implementation of "MambaPainter: Neural Stroke-Based Rendering in a Single Step." ☆14 · Sep 29, 2024 · Updated last year
- Code repo for "SketchODE: Learning neural sketch representation in continuous time" published in ICLR 2022 ☆11 · Apr 19, 2022 · Updated 3 years ago
- A collection of resources on face swap. ☆10 · Aug 26, 2022 · Updated 3 years ago
- ☆11 · Oct 12, 2016 · Updated 9 years ago
- [NeurIPS 2025] This is the official repository for "RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis" ☆26 · Nov 21, 2025 · Updated 3 months ago
- EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions ☆20 · Jul 29, 2025 · Updated 7 months ago
- AI (Artificial Intelligence) can be used to generate lyrics of songs!