buxiangzhiren/DDCap

buxiangzhiren / DDCap

☆85

Alternatives and similar repositories for DDCap

Users that are interested in DDCap are comparing it to the libraries listed below.

jianjieluo / SCD-Net
[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion m…
☆68 · Jun 11, 2024 · Updated 2 years ago
xu-shitong / diffusion-image-captioning
Implementation of the paper https://arxiv.org/abs/2210.04559
☆56 · Nov 26, 2025 · Updated 8 months ago
luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
lvyufeng / uie_mindspore
☆12 · Mar 21, 2024 · Updated 2 years ago
bladewaltz1 / ModeCap
Controllable image captioning model with unsupervised modes
☆21 · Apr 14, 2023 · Updated 3 years ago
aimagelab / PMA-Net
[ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.
☆19 · Jun 7, 2024 · Updated 2 years ago
jiahuei / sparse-image-captioning
Image captioning with weight pruning in PyTorch
☆22 · Jan 14, 2022 · Updated 4 years ago
jacobswan1 / ViTCAP
Implementation for CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".
☆43 · May 28, 2022 · Updated 4 years ago
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆70 · Jun 1, 2024 · Updated 2 years ago
aimagelab / camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
☆30 · Dec 1, 2022 · Updated 3 years ago
zhangxuying1004 / RSTNet
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
☆123 · Dec 17, 2022 · Updated 3 years ago
baaaad / ECE
[ECCV'22 Poster] Explicit Image Caption Editing
☆22 · Nov 30, 2022 · Updated 3 years ago
YuanEZhou / CBTrans
☆24 · Apr 4, 2022 · Updated 4 years ago
xmu-xiaoma666 / LSTNet
Towards Local Visual Modeling for Image Captioning
☆30 · Mar 31, 2023 · Updated 3 years ago
GT-RIPL / Xmodal-Ctx
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …
☆61 · Oct 21, 2022 · Updated 3 years ago
mhh0318 / UniD3
☆55 · Feb 9, 2023 · Updated 3 years ago
luo3300612 / image-captioning-DLCT
Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
☆203 · Jun 8, 2022 · Updated 4 years ago
yahoo / object_relation_transformer
Implementation of the Object Relation Transformer for Image Captioning
☆180 · Sep 17, 2024 · Updated last year
catfish132 / DiffusionRRG
☆10 · Aug 24, 2023 · Updated 2 years ago
NovaMind-Z / PTSN
Repository for an end-to-end image captioning method, PTSN (ACM MM22).
☆60 · Dec 11, 2022 · Updated 3 years ago
terry-r123 / Awesome-Captioning
A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)
☆113 · Jun 6, 2022 · Updated 4 years ago
jchenghu / ExpansionNet_v2
Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"
☆96 · Dec 25, 2024 · Updated last year
Gitsamshi / WeakVRD-Captioning
Implementation of paper "Improving Image Captioning with Better Use of Caption"
☆33 · Sep 15, 2020 · Updated 5 years ago
kakaobrain / noc
☆47 · Apr 29, 2024 · Updated 2 years ago
SeleenaJM / CapEval
An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019)
☆37 · May 3, 2020 · Updated 6 years ago
yuPeiyu98 / Latent-Diffusion-EBM
[ICML 2022] Latent Diffusion Energy-Based Model for Interpretable Text Modeling
☆67 · Mar 1, 2026 · Updated 4 months ago
RyanLiut / awesome-diverse-captioning
Some papers about *diverse* image (a few videos) captioning
☆25 · Apr 4, 2023 · Updated 3 years ago
Letian2003 / C-VQA
Counterfactual Reasoning VQA Dataset
☆28 · Nov 23, 2023 · Updated 2 years ago
ChenyuHeidiZhang / VL-commonsense
☆14 · May 23, 2022 · Updated 4 years ago
diggerdu / AudioMamba
☆12 · Jun 1, 2024 · Updated 2 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26 · Oct 20, 2022 · Updated 3 years ago
MCLAB-OCR / KnowledgeMiningWithSceneText
☆38 · Feb 4, 2023 · Updated 3 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆117 · Sep 15, 2022 · Updated 3 years ago
google-research / pix2seq
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
☆945 · Nov 7, 2023 · Updated 2 years ago
zinengtang / DeCEMBERT
Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
libeineu / fairseq_mmt
This code repository is for the accepted ACL2022 paper "On Vision Features in Multimodal Machine Translation". We provide the details and…
☆43 · Sep 16, 2022 · Updated 3 years ago
synlp / R2GenRL
The code for our ACL-2022 paper titled "Reinforced Cross-modal Alignment for Radiology Report Generation"
☆30 · Nov 4, 2022 · Updated 3 years ago
nabihach / IDA
☆13 · Jan 8, 2020 · Updated 6 years ago
mengqiDyangge / HierKD
☆39 · Aug 25, 2022 · Updated 3 years ago