jssprz/video_captioning_datasets

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

☆134

Alternatives and similar repositories for video_captioning_datasets

Users who are interested in video_captioning_datasets are comparing it to the repositories listed below.

jssprz / visual_syntactic_embedding_video_captioning
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
☆30 · Apr 16, 2021 · Updated 5 years ago
shengyuzhang / Poet
Poet: Product-oriented Video Captioner for E-commerce
☆12 · Sep 21, 2020 · Updated 5 years ago
forence / Awesome-Visual-Captioning
This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP
☆410 · Nov 14, 2022 · Updated 3 years ago
syuqings / video-paragraph
Code for the paper "Towards Diverse Paragraph Captioning for Untrimmed Videos" (CVPR 2021)
☆66 · Oct 21, 2021 · Updated 4 years ago
tgc1997 / RMN
IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning
☆79 · Nov 23, 2020 · Updated 5 years ago
SydCaption / SAAT
☆62 · May 11, 2021 · Updated 5 years ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆230 · Jan 3, 2024 · Updated 2 years ago
facebookresearch / grounded-video-description
Video Grounding and Captioning
☆331 · Oct 12, 2021 · Updated 4 years ago
nasib-ullah / video-captioning-models-in-Pytorch
A PyTorch implementation of state-of-the-art video captioning models from 2015-2019 on the MSVD and MSRVTT datasets.
☆73 · Jul 30, 2023 · Updated 2 years ago
microsoft / SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
☆251 · May 26, 2022 · Updated 4 years ago
willyfh / awesome-video-text-datasets
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or v…
☆40 · Feb 18, 2024 · Updated 2 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 · Feb 16, 2022 · Updated 4 years ago
dialogtekgeek / AVSD-DSTC10_Official
Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)
☆27 · Aug 19, 2022 · Updated 3 years ago
zinengtang / DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
chihyaoma / cyclical-visual-captioning
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
☆46 · Jul 29, 2020 · Updated 5 years ago
CASIA-IVA-Lab / VALOR
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
☆311 · Dec 25, 2024 · Updated last year
RyanLiut / awesome-diverse-captioning
Some papers about *diverse* image (a few videos) captioning
☆25 · Apr 4, 2023 · Updated 3 years ago
UARK-AICV / VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
☆68 · Feb 16, 2024 · Updated 2 years ago
m-bain / frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
☆376 · May 19, 2022 · Updated 4 years ago
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆113 · Jun 6, 2022 · Updated 4 years ago
v-iashin / MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
☆144 · Apr 8, 2023 · Updated 3 years ago
simon-ging / coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
☆291 · Sep 6, 2022 · Updated 3 years ago
yangbang18 / Non-Autoregressive-Video-Captioning
The PyTorch code of the AAAI2021 paper "Non-Autoregressive Coarse-to-Fine Video Captioning".
☆57 · Oct 22, 2023 · Updated 2 years ago
kyegomez / Mirasol
PyTorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"
☆26 · Jan 27, 2025 · Updated last year
tgc1997 / Awesome-Video-Captioning
A curated list of research papers in Video Captioning
☆121 · Jan 5, 2021 · Updated 5 years ago
NingWang2049 / STIGPN
Space-Time Interaction Graph Parsing Networks for Human-Object Interaction Recognition, ACM MM'21
☆14 · May 12, 2022 · Updated 4 years ago
xiaobai1217 / Awesome-Video-Datasets
Video datasets
☆1,658 · Mar 8, 2023 · Updated 3 years ago
dmoltisanti / air-cvpr23
This repository contains the Adverbs in Recipes (AIR) dataset and the code published with the CVPR 23 paper: "Learning Action Changes by Me…
☆13 · May 25, 2023 · Updated 3 years ago
jayleicn / recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
☆170 · Dec 4, 2020 · Updated 5 years ago
Sejong-VLI / V2T-Action-Graph-JKSUCIS-2023
The implementation of the paper "Action Knowledge for Video Captioning with Graph Neural Networks" (JKSUCIS 2023).
☆14 · Mar 29, 2023 · Updated 3 years ago
m-bain / webvid
Large-scale text-video dataset. 10 million captioned short videos.
☆685 · Aug 14, 2024 · Updated last year
xiadingZ / video-caption-openNMT.pytorch
Implements video captioning based on OpenNMT
☆36 · Apr 19, 2018 · Updated 8 years ago
WingsBrokenAngel / MSR-VTT-DataCleaning
☆19 · Dec 22, 2022 · Updated 3 years ago
google-research-datasets / Video-Timeline-Tags-ViTT
A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free…
☆30 · Jan 15, 2022 · Updated 4 years ago
tingyaohsu / VIST-Edit
Visual Storytelling post-edit dataset
☆18 · Sep 27, 2019 · Updated 6 years ago
RenShuhuai-Andy / TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
☆425 · May 8, 2025 · Updated last year
jaeyun95 / pre-trained-vlk-model
Summary of pre-trained vision-and-language models
☆12 · Apr 20, 2021 · Updated 5 years ago
LuoweiZhou / YouCook2-Leaderboard
A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.
☆41 · Jun 29, 2022 · Updated 4 years ago
smartyfh / DST-ASSIST
ASSIST: Towards Label Noise-Robust Dialogue State Tracking
☆10 · Apr 11, 2022 · Updated 4 years ago