Yaojie-Shen/CoCap

[ICCV 2023] Accurate and Fast Compressed Video Captioning

☆52

Alternatives and similar repositories for CoCap

Users that are interested in CoCap are comparing it to the libraries listed below.

MarcusNerva / HMN
[CVPR2022] Official code for Hierarchical Modular Network for Video Captioning. Our proposed HMN is implemented with PyTorch.
☆50 · Sep 30, 2022 · Updated 3 years ago
ylqi / GL-RG
The code of IJCAI22 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
☆18 · May 10, 2023 · Updated 3 years ago
Sejong-VLI / V2T-Action-Graph-JKSUCIS-2023
The implementation of a paper entitled "Action Knowledge for Video Captioning with Graph Neural Networks" (JKSUCIS 2023).
☆14 · Mar 29, 2023 · Updated 3 years ago
microsoft / SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
☆251 · May 26, 2022 · Updated 4 years ago
Nathan-Li123 / LaMOT
[ICRA 2025] LaMOT: Language-Guided Multi-Object Tracking
☆30 · Feb 10, 2025 · Updated last year
hobincar / SGN
Official pytorch implementation of the AAAI 2021 paper "Semantic Grouping Network for Video Captioning"
☆54 · Jul 9, 2021 · Updated 5 years ago
juneyeeHu / LM-EEC
Robust Ego-Exo Correspondence with Long-Term Memory
☆15 · Dec 2, 2025 · Updated 7 months ago
UARK-AICV / VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
☆68 · Feb 16, 2024 · Updated 2 years ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆230 · Jan 3, 2024 · Updated 2 years ago
HengLan / SMOT
[ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking
☆31 · Sep 12, 2024 · Updated last year
W-Wu / DEER
☆12 · Aug 25, 2023 · Updated 2 years ago
HengLan / Awesome-Visual-Tracking
Awesome Visual Tracking
☆25 · Oct 3, 2025 · Updated 9 months ago
dhg-wei / DeCap
ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning
☆144 · Mar 16, 2023 · Updated 3 years ago
yangbang18 / Non-Autoregressive-Video-Captioning
The PyTorch code of the AAAI2021 paper "Non-Autoregressive Coarse-to-Fine Video Captioning".
☆57 · Oct 22, 2023 · Updated 2 years ago
bentoml / BentoSentenceTransformers
how to build a sentence embedding application using BentoML
☆15 · Jul 14, 2026 · Updated 2 weeks ago
yiskw713 / VideoCaptioning
video captioning using 3DCNN and LSTM (pytorch)
☆11 · Sep 26, 2019 · Updated 6 years ago
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆113 · Jun 6, 2022 · Updated 4 years ago
HengLan / PlanarTrack
[ICCV 2023] PlanarTrack: A Large-scale Challenging Benchmark for Planar Object Tracking.
☆41 · Aug 26, 2023 · Updated 2 years ago
baoqianyue / DFC2021-Track-MSD
Third place of 2021 IEEE GRSS Data Fusion Contest: Track MSD
☆10 · Mar 31, 2021 · Updated 5 years ago
VUT-HFUT / MAC_2024_baseline
[MAC 2024] The baseline code for MAC 2024.
☆12 · Jun 3, 2025 · Updated last year
GX77 / TextKG
☆11 · Jun 27, 2023 · Updated 3 years ago
intel / TVP
☆15 · Aug 4, 2025 · Updated 11 months ago
deepgram-devs / prerecorded-audio-notebook
☆13 · Nov 28, 2025 · Updated 8 months ago
baoqianyue / Trick
On the road of growing as a developer
☆10 · Dec 25, 2018 · Updated 7 years ago
X-PLUG / mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
☆227 · Jul 21, 2023 · Updated 3 years ago
zchoi / VCRN
☆11 · Jul 11, 2023 · Updated 3 years ago
yangbang18 / MultiCapCLIP
(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 · Aug 8, 2024 · Updated last year
the-anonymous-bs / av-SALMONN
av-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
☆13 · May 8, 2024 · Updated 2 years ago
HengLan / CGSTVG
[CVPR 2024] Context-Guided Spatio-Temporal Video Grounding
☆66 · Jun 28, 2024 · Updated 2 years ago
Sreyan88 / CompA
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23 · Jul 10, 2024 · Updated 2 years ago
swagshaw / WildDESED
WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection
☆18 · Nov 19, 2024 · Updated last year
gzhu06 / Cacophony
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49 · Jan 19, 2026 · Updated 6 months ago
nasib-ullah / video-captioning-models-in-Pytorch
A PyTorch implementation of state of the art video captioning models from 2015-2019 on MSVD and MSRVTT datasets.
☆73 · Jul 30, 2023 · Updated 2 years ago
Dslab-NLP / Tibetan-PLM
☆18 · Oct 8, 2023 · Updated 2 years ago
muyuuuu / XDU-report-LaTeX-template
The LaTeX template of experiment report, XDU.
☆13 · Dec 7, 2020 · Updated 5 years ago
minghangz / OnVTG
Online video temporal grounding
☆16 · Oct 20, 2025 · Updated 9 months ago
wanghao15536870732 / Android-programming-authority-guide
🔨 Source code, notes, and challenge-exercise implementations for the Android programming authoritative guide
☆11 · May 6, 2021 · Updated 5 years ago
android-nuc / 17-C-Train
C training for 17 freshmen
☆14 · Oct 28, 2017 · Updated 8 years ago
sail-sg / Video-Next-Event-Prediction
☆28 · Aug 9, 2025 · Updated 11 months ago