SjokerLily/awesome-image-captioning

A paper list of image captioning.

☆21

Alternatives and similar repositories for awesome-image-captioning

Users who are interested in awesome-image-captioning are comparing it to the repositories listed below.

jiahuei / sparse-image-captioning
Image captioning with weight pruning in PyTorch
☆22 · Jan 14, 2022 · Updated 4 years ago
YuanEZhou / CBTrans
☆24 · Apr 4, 2022 · Updated 4 years ago
malihealikhani / Cross-modal_Coherence_Modeling
Cross-modal Coherence Modeling for Caption Generation
☆11 · Jul 24, 2020 · Updated 5 years ago
GT-RIPL / Xmodal-Ctx
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …
☆61 · Oct 21, 2022 · Updated 3 years ago
fawazsammani / look-and-modify
Look and Modify: Modification Networks for Image Captioning, BMVC 2019
☆21 · Feb 18, 2020 · Updated 6 years ago
zhangxuying1004 / RSTNet
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
☆123 · Dec 17, 2022 · Updated 3 years ago
Gitsamshi / WeakVRD-Captioning
Implementation of the paper "Improving Image Captioning with Better Use of Caption"
☆33 · Sep 15, 2020 · Updated 5 years ago
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆113 · Jun 6, 2022 · Updated 4 years ago
xmu-xiaoma666 / SDATR
Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM 2022)
☆19 · Oct 15, 2022 · Updated 3 years ago
aimagelab / camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
☆30 · Dec 1, 2022 · Updated 3 years ago
quangvnai / grit
GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)
☆199 · May 9, 2023 · Updated 3 years ago
CrossmodalGroup / ER-SAN
Implementation of our IJCAI 2022 oral paper, ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning.
☆25 · Aug 5, 2023 · Updated 2 years ago
luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
marcopede / AreasOfAttention
☆10 · Apr 20, 2018 · Updated 8 years ago
yaolinli / IDC
☆30 · Oct 19, 2022 · Updated 3 years ago
fawazsammani / show-edit-tell
Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020
☆82 · Jul 17, 2020 · Updated 6 years ago
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆70 · Jun 1, 2024 · Updated 2 years ago
weimingboya / DFT
☆13 · Jun 2, 2023 · Updated 3 years ago
sushizixin / CLIP4IDC
CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022)
☆36 · Nov 12, 2022 · Updated 3 years ago
yahoo / object_relation_transformer
Implementation of the Object Relation Transformer for Image Captioning
☆180 · Sep 17, 2024 · Updated last year
husthuaan / AoANet
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
☆339 · May 2, 2021 · Updated 5 years ago
mrwu-mac / DIFNet
[CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning".
☆21 · Nov 28, 2022 · Updated 3 years ago
hwanheelee1993 / ViLBERTScore
Code for ViLBERTScore in EMNLP Eval4NLP
☆18 · Oct 27, 2022 · Updated 3 years ago
luo3300612 / image-captioning-DLCT
Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
☆203 · Jun 8, 2022 · Updated 4 years ago
xu-shitong / diffusion-image-captioning
Implementation of the paper at https://arxiv.org/abs/2210.04559
☆56 · Nov 26, 2025 · Updated 7 months ago
siwooyong / Codalab-Microsoft-COCO-Image-Captioning-Challenge
🥉 Codalab-Microsoft-COCO-Image-Captioning-Challenge 3rd place solution (06.30.21)
☆23 · Apr 6, 2022 · Updated 4 years ago
fenglinliu98 / MIA
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
☆65 · Oct 19, 2020 · Updated 5 years ago
jacobswan1 / ViTCAP
Implementation for CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".
☆43 · May 28, 2022 · Updated 4 years ago
ezeli / BUTD_model
A PyTorch implementation of "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" for image captioning.
☆48 · Nov 15, 2021 · Updated 4 years ago
zchoi / S2-Transformer
[IJCAI 2022] Official PyTorch code for the paper "S2 Transformer for Image Captioning"
☆86 · Aug 14, 2024 · Updated last year
JDAI-CV / image-captioning
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
☆273 · Jul 27, 2021 · Updated 4 years ago
wtliao / ImageTransformer
Image Captioning through Image Transformer
☆40 · Dec 29, 2020 · Updated 5 years ago
aurooj / WSG-VQA-VLTransformers
Weakly Supervised Grounding for VQA in Vision-Language Transformers
☆16 · May 6, 2023 · Updated 3 years ago
evanmiltenburg / MeasureDiversity
Measure the diversity of image descriptions, repository for our COLING 2018 paper.
☆13 · Dec 29, 2019 · Updated 6 years ago
visinf / cos-cvae
Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)
☆37 · May 16, 2022 · Updated 4 years ago
aimagelab / meshed-memory-transformer
Meshed-Memory Transformer for Image Captioning. CVPR 2020
☆546 · Dec 21, 2022 · Updated 3 years ago
husthuaan / AAT
Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019
☆50 · Dec 18, 2019 · Updated 6 years ago
xiyan-fu / MM-AVS
A Full-Scale Dataset for Multi-modal Summarization
☆16 · Dec 8, 2021 · Updated 4 years ago
ruotianluo / self-critical.pytorch
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
☆1,003 · Oct 5, 2023 · Updated 2 years ago