Sejong-VLI / V2T-Action-Graph-JKSUCIS-2023Links

The implementation of a paper entitled "Action Knowledge for Video Captioning with Graph Neural Networks" (JKSUCIS 2023).

☆13

Alternatives and similar repositories for V2T-Action-Graph-JKSUCIS-2023

Users that are interested in V2T-Action-Graph-JKSUCIS-2023 are comparing it to the libraries listed below

Sorting:

MarcusNerva / HMN
[CVPR2022] Official code for Hierarchical Modular Network for Video Captioning. Our proposed HMN is implemented with PyTorch.
☆50Updated 3 years ago
LiJiaBei-7 / rivrl
Source code of our TCSVT'22 paper Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval
☆19Updated 3 years ago
Huntersxsx / TSGV-Learning-List
Temporal Sentence Grounding in Videos / Natural Language Video Localization / Video Moment Retrieval的相关工作
☆29Updated 3 years ago
liupeng0606 / clip4caption
The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021)
☆15Updated 2 years ago
foolwood / DRL
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
☆97Updated 3 years ago
Huntersxsx / MGPN
source code of our MGPN in SIGIR 2022
☆18Updated 3 years ago
ZiyueWu59 / CCA
☆15Updated last year
engindeniz / vitis
[ICCV 2023 CLVL Workshop] Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts
☆14Updated 9 months ago
ylqi / GL-RG
The code of IJCAI22 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
☆18Updated 2 years ago
crodriguezo / TMLGA
Repository of proposal-free temporal moment localization work
☆33Updated last year
sail-sg / VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
☆48Updated 2 years ago
UARK-AICV / VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
☆68Updated last year
linjieli222 / HERO_Video_Feature_Extractor
Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
☆115Updated 4 years ago
sangminwoo / Explore-And-Match
Official pytorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding …
☆42Updated 3 years ago
layer6ai-labs / xpool
https://layer6ai-labs.github.io/xpool/
☆129Updated 2 years ago
minghangz / cpl
CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning
☆64Updated last year
HuiGuanLab / ms-sl
Source code of our MM'22 paper Partially Relevant Video Retrieval
☆54Updated 11 months ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆224Updated last year
doc-doc / NExT-OE
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆30Updated 2 years ago
xiaoneil / LPNet
☆13Updated 3 years ago
kevinliang888 / IVR-QA-baselines
[ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers
☆16Updated last year
bofang98 / UATVR
[ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval
☆13Updated last year
boheumd / A2Summ
The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)
☆78Updated 2 years ago
HuiGuanLab / DL-DKD
ICCV'23 Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval
☆19Updated 2 months ago
nasib-ullah / video-captioning-models-in-Pytorch
A PyTorch implementation of state of the art video captioning models from 2015-2019 on MSVD and MSRVTT datasets.
☆73Updated 2 years ago
alexandrosXe / A-Simple-Baseline-For-Knowledge-Based-VQA
Repo for the EMNLP 2023 paper "A Simple Knowledge-Based Visual Question Answering"
☆25Updated last year
jssprz / video_captioning_datasets
Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Pe…
☆130Updated 2 years ago
Yaojie-Shen / CoCap
[ICCV 2023] Accurate and Fast Compressed Video Captioning
☆48Updated 3 months ago
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆69Updated last year
26hzhang / ReLoCLNet
Video Corpus Moment Retrieval with Contrastive Learning (SIGIR 2021)
☆58Updated 4 years ago