sithu31296 / image-captioning
Simple and Easy to use Image Captioning Implementation
☆9 · Updated 3 years ago
Alternatives and similar repositories for image-captioning:
Users who are interested in image-captioning are comparing it to the libraries listed below:
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022 ☆29 · Updated 2 years ago
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022) ☆34 · Updated 2 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning ☆35 · Updated 7 months ago
- ☆9 · Updated 2 years ago
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆43 · Updated 3 months ago
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆24 · Updated 4 months ago
- A length-controllable and non-autoregressive image captioning model. ☆68 · Updated 3 years ago
- ☆26 · Updated 3 years ago
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022 ☆35 · Updated last year
- Using LLMs and pre-trained caption models for super-human performance on image captioning. ☆40 · Updated last year
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. (CVPR 2023) ☆60 · Updated 3 weeks ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆35 · Updated 7 months ago
- ☆83 · Updated 2 years ago
- ☆46 · Updated 3 years ago
- ☆19 · Updated last year
- [BMVC 2023 (Oral)] Official pytorch implementation of the paper: "Unsupervised Hashing with Similarity Distribution Calibration" ☆18 · Updated last year
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching" ☆25 · Updated 2 years ago
- Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for … ☆61 · Updated 2 years ago
- [SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval. Also, a text-video retrieval toolbox based on CLIP + fast p… ☆130 · Updated 2 years ago
- ☆29 · Updated last year
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method. ☆30 · Updated last year
- Code for ICCV2021: Discovering Human Interactions with Large-Vocabulary Objects via Query and Multi-Scale Detection ☆24 · Updated 3 years ago
- a py3 lib for NLP & image-caption metrics : BLEU METEOR CIDEr ROUGE SPICE WMD ☆14 · Updated 2 years ago
- Referring Image Segmentation Benchmarking with Segment Anything Model (SAM) ☆38 · Updated last year
- Use CLIP to represent video for Retrieval Task ☆69 · Updated 4 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆18 · Updated 3 years ago
- ☆30 · Updated last year
- [ECCV'22 Poster] Explicit Image Caption Editing ☆21 · Updated 2 years ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆38 · Updated last year
- [CVPR 2022] The code for our paper "Object-aware Video-language Pre-training for Retrieval" ☆62 · Updated 2 years ago