yaojie-shen / CoCap

[ICCV 2023] Accurate and Fast Compressed Video Captioning

☆52

Alternatives and similar repositories for CoCap

Users who are interested in CoCap are comparing it to the repositories listed below.

MarcusNerva / HMN
[CVPR 2022] Official code for Hierarchical Modular Network for Video Captioning. Our proposed HMN is implemented with PyTorch.
☆50 · Sep 30, 2022 · Updated 3 years ago
liupeng0606 / clip4caption
The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021)
☆15 · Jan 2, 2023 · Updated 3 years ago
ylqi / GL-RG
The code of IJCAI22 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
☆18 · May 10, 2023 · Updated 2 years ago
microsoft / SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
☆247 · May 26, 2022 · Updated 3 years ago
bentoml / BentoSentenceTransformers
How to build a sentence embedding application using BentoML
☆14 · Mar 31, 2025 · Updated 10 months ago
baoqianyue / Trick
On the road of growing as a developer
☆10 · Dec 25, 2018 · Updated 7 years ago
hkust-nlp / model-task-align-rl
[ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".
☆15 · Updated this week
baoqianyue / DFC2021-Track-MSD
Third place of 2021 IEEE GRSS Data Fusion Contest: Track MSD
☆10 · Mar 31, 2021 · Updated 4 years ago
UARK-AICV / VLTinT
[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
☆68 · Feb 16, 2024 · Updated last year
android-nuc / 17_Java_Train
Java training for the Android Lab, class of 2017
☆12 · Nov 17, 2017 · Updated 8 years ago
GX77 / TextKG
☆11 · Jun 27, 2023 · Updated 2 years ago
yangbang18 / CARE
(TIP'2023) Concept-Aware Video Captioning: Describing Videos with Effective Prior Information
☆32 · Dec 26, 2024 · Updated last year
muyuuuu / XDU-report-LaTeX-template
The LaTeX template of experiment report, XDU.
☆13 · Dec 7, 2020 · Updated 5 years ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated last year
Nathan-Li123 / LaMOT
[ICRA 2025] LaMOT: Language-Guided Multi-Object Tracking
☆29 · Feb 10, 2025 · Updated last year
wanghao15536870732 / Android-programming-authority-guide
🔨 Source code, notes, and challenge exercise implementations for the Android programming authoritative guide (安卓编程权威指南)
☆11 · May 6, 2021 · Updated 4 years ago
Arcee-LYK / Multi-to-Single
The official code of paper "Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition through Contrastive Learning" (AAAI 20…
☆30 · Sep 30, 2025 · Updated 4 months ago
W-Wu / DEER
☆12 · Aug 25, 2023 · Updated 2 years ago
yangbang18 / MultiCapCLIP
(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
☆36 · Aug 8, 2024 · Updated last year
tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19 · Mar 10, 2025 · Updated 11 months ago
swagshaw / WildDESED
WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection
☆17 · Nov 19, 2024 · Updated last year
VUT-HFUT / MAC_2024_baseline
[MAC 2024] The baseline code for MAC 2024.
☆12 · Jun 3, 2025 · Updated 8 months ago
intel / TVP
☆15 · Aug 4, 2025 · Updated 6 months ago
hobincar / SGN
Official PyTorch implementation of the AAAI 2021 paper "Semantic Grouping Network for Video Captioning"
☆54 · Jul 9, 2021 · Updated 4 years ago
dhg-wei / DeCap
ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning
☆138 · Mar 16, 2023 · Updated 2 years ago
baoqianyue / StudyNotes
Study notes
☆17 · Jul 9, 2019 · Updated 6 years ago
android-nuc / 17-C-Train
C training for 2017 freshmen
☆14 · Oct 28, 2017 · Updated 8 years ago
Sreyan88 / CompA
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆22 · Jul 10, 2024 · Updated last year
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆112 · Jun 6, 2022 · Updated 3 years ago
Angus-Liu / Notes
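New starting point, keep trying. Things recorded during everyday study, of limited reference value to others. Currently re-planning and building a knowledge base that will cover a wider range of topics. Link 👉 https://github.com/Angus-Liu/mtbox
☆17 · Jul 7, 2023 · Updated 2 years ago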
deep-real / DEAL
The PyTorch implementation for "DEAL: Disentangle and Localize Concept-level Explanations for VLMs" (ECCV 2024 Strong Double Blind)
☆20 · Nov 7, 2024 · Updated last year
QxLabIreland / AQP
☆24 · Jun 13, 2022 · Updated 3 years ago
hulianyuyy / iLLaVA
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models
☆21 · Jan 29, 2025 · Updated last year
joeyz0z / ConZIC
Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"
☆75 · Sep 20, 2023 · Updated 2 years ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆228 · Jan 3, 2024 · Updated 2 years ago
ioanacroi / longmoment-detr
Moment Detection in Long Tutorial Videos
☆20 · May 8, 2024 · Updated last year
princetonvisualai / merv
Unifying Specialized Visual Encoders for Video Language Models
☆25 · Nov 22, 2025 · Updated 2 months ago
pengzhansun / Counterfactual-Debiasing-Network
[ACM MM 2021] A causal perspective for compositional action recognition, providing a counterfactual debiasing inference implementation to…
☆20 · May 5, 2022 · Updated 3 years ago
android-nuc / 18-C-Train
C language training for the Android Lab (AI + Mobile Internet), class of 2018
☆16 · Dec 8, 2018 · Updated 7 years ago