xu-shitong/diffusion-image-captioning

Implementation of the paper https://arxiv.org/abs/2210.04559

☆56

Alternatives and similar repositories for diffusion-image-captioning

Users who are interested in diffusion-image-captioning are comparing it to the repositories listed below.

buxiangzhiren / DDCap
☆85 · Dec 4, 2022 · Updated 3 years ago
jiahuei / sparse-image-captioning
Image captioning with weight pruning in PyTorch
☆22 · Jan 14, 2022 · Updated 4 years ago
baaaad / ECE
[ECCV'22 Poster] Explicit Image Caption Editing
☆22 · Nov 30, 2022 · Updated 3 years ago
aimagelab / PMA-Net
[ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.
☆19 · Jun 7, 2024 · Updated 2 years ago
multimodal-art-projection / IV-Bench
☆14 · Apr 23, 2025 · Updated last year
Gitsamshi / WeakVRD-Captioning
Implementation of paper "Improving Image Captioning with Better Use of Caption"
☆33 · Sep 15, 2020 · Updated 5 years ago
jacobswan1 / ViTCAP
Implementation for CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".
☆43 · May 28, 2022 · Updated 4 years ago
yegcjs / DINOISER
☆26 · Jul 15, 2025 · Updated last year
SjokerLily / awesome-image-captioning
A paper list of image captioning.
☆21 · Apr 23, 2022 · Updated 4 years ago
aimagelab / camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
☆30 · Dec 1, 2022 · Updated 3 years ago
NovaMind-Z / PTSN
Repository for an end-to-end image captioning method PTSN (ACM MM22).
☆60 · Dec 11, 2022 · Updated 3 years ago
bladewaltz1 / ModeCap
Controllable image captioning model with unsupervised modes
☆21 · Apr 14, 2023 · Updated 3 years ago
YuanEZhou / CBTrans
☆24 · Apr 4, 2022 · Updated 4 years ago
XiangLi1999 / Diffusion-LM
Diffusion-LM
☆1,245 · Aug 8, 2024 · Updated last year
m-decoster / fpt4slt
Frozen Pretrained Transformers for Neural Sign Language Translation
☆15 · Apr 23, 2022 · Updated 4 years ago
INK-USC / CALM
Source code for ICLR 2021 paper: Pre-training Text-to-Text Transformers for Concept-Centric Common Sense
☆25 · Sep 16, 2021 · Updated 4 years ago
DavidHuji / CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆209 · Jan 28, 2024 · Updated 2 years ago
YuanEZhou / satic
☆26 · Jun 25, 2021 · Updated 5 years ago
JegZheng / truncated-diffusion-probabilistic-models
PyTorch implementation of TDPM
☆37 · Feb 26, 2023 · Updated 3 years ago
mpskex / AttentiveContrastiveLearningNetwork
Code Release for Attentive Contrastive Learning Network for Fine Grained Visual Classification
☆12 · Jan 18, 2023 · Updated 3 years ago
princetonvisualai / SPICE-U
☆11 · Sep 7, 2020 · Updated 5 years ago
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆70 · Jun 1, 2024 · Updated 2 years ago
Yuanhy1997 / SeqDiffuSeq
Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation [NAACL 2024]
☆99 · Aug 17, 2023 · Updated 2 years ago
XuMengyaAmy / SwinMLP_TranCAP
☆13 · Jun 26, 2022 · Updated 4 years ago
visinf / cos-cvae
Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)
☆37 · May 16, 2022 · Updated 4 years ago
Shark-NLP / DiffuSeq
[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
☆838 · Mar 1, 2024 · Updated 2 years ago
SenJia / Position-Information
How Much Position Information Do Convolutional Neural Networks Encode?
☆11 · Sep 20, 2021 · Updated 4 years ago
LCM-Lab / Bridge_Gap_Diffusion
Diffusion Model Improvement Method
☆35 · Sep 4, 2023 · Updated 2 years ago
gwang-kim / DiffusionCLIP
[CVPR 2022] Official PyTorch Implementation for DiffusionCLIP: Text-guided Image Manipulation Using Diffusion Models
☆867 · Mar 27, 2023 · Updated 3 years ago
zhjgao / difformer
The official codebase for "Empowering Diffusion Models on the Embedding Space for Text Generation" (NAACL 2024)
☆56 · Apr 23, 2024 · Updated 2 years ago
hanlinwu / SADN
Scale-aware Super-resolution Network
☆19 · Aug 28, 2024 · Updated last year
sejong-rcv / PVLR
[ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
☆13 · Oct 8, 2024 · Updated last year
fkodom / clip-text-decoder
Generate text captions for images from their embeddings.
☆119 · Aug 1, 2023 · Updated 2 years ago
eric-xw / Video-guided-Machine-Translation
Starter code for the VMT task and challenge
☆51 · Jul 29, 2020 · Updated 5 years ago
xmu-xiaoma666 / SDATR
Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM2022)
☆19 · Oct 15, 2022 · Updated 3 years ago
feizc / DeeCap
Dynamic Early Exit for Image Captioning
☆17 · Oct 25, 2022 · Updated 3 years ago
e-bug / fine-grained-evals
[ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"
☆13 · Jun 11, 2023 · Updated 3 years ago
yoon307 / DiG
Official repository for ECCV24 paper "Diffusion-Guided Weakly Supervised Semantic Segmentation"
☆19 · Dec 17, 2024 · Updated last year
ArikReuter / TNTM
This repository contains the code for the Transformer-Representation Neural Topic Model (TNTM) based on the paper "Probabilistic Topic Mo…
☆12 · Jul 6, 2024 · Updated 2 years ago