xu-shitong / diffusion-image-captioning
Implementation of the paper https://arxiv.org/abs/2210.04559
☆54Updated 2 years ago
Related projects
Alternatives and complementary repositories for diffusion-image-captioning
- ☆85Updated last year
- [CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion m…☆56Updated 5 months ago
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing"☆73Updated last year
- ☆55Updated last year
- This repo contains codes and instructions for baselines in the VLUE benchmark.☆41Updated 2 years ago
- ☆61Updated last year
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆35Updated 3 months ago
- Controllable image captioning model with unsupervised modes☆21Updated last year
- Python 3 support for the MS COCO caption evaluation tools☆14Updated 5 months ago
- The SVO-Probes Dataset for Verb Understanding☆31Updated 2 years ago
- ☆43Updated last year
- Official repository for the A-OKVQA dataset☆64Updated 6 months ago
- ☆63Updated 5 years ago
- Implementation for CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".☆41Updated 2 years ago
- Official code for "What Makes for Good Visual Tokenizers for Large Language Models?".☆56Updated last year
- 📍 Official pytorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)☆48Updated last year
- This repo is the official implementation of UPL (Unsupervised Prompt Learning for Vision-Language Models).☆106Updated 2 years ago
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆25Updated 11 months ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Updated last year
- Retrieval-augmented Image Captioning☆12Updated last year
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Updated last year
- [ECCV'22 Poster] Explicit Image Caption Editing☆21Updated last year
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023☆56Updated 3 weeks ago
- ViLLA: Fine-grained vision-language representation learning from real-world data☆40Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning☆37Updated last year
- A curated list of zero-shot captioning papers☆21Updated last year
- MixGen: A New Multi-Modal Data Augmentation☆116Updated last year
- Colorful Prompt Tuning for Pre-trained Vision-Language Models☆46Updated 2 years ago
- [ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"☆65Updated 3 years ago
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022)☆28Updated 2 years ago