Aldenhovel / bleu-rouge-meteor-cider-spice-eval4imagecaption
Evaluation tools for image captioning, including BLEU, ROUGE-L, CIDEr, METEOR, and SPICE scores.
☆28Updated 2 years ago
Alternatives and similar repositories for bleu-rouge-meteor-cider-spice-eval4imagecaption:
Users who are interested in bleu-rouge-meteor-cider-spice-eval4imagecaption are comparing it to the libraries listed below.
- SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation☆101Updated last year
- 🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)☆64Updated last year
- An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"☆153Updated 11 months ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆189Updated last year
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension☆48Updated 11 months ago
- Natural language guided image captioning☆79Updated last year
- Implementation for the CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".☆42Updated 2 years ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning☆150Updated 2 years ago
- (ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning☆35Updated 7 months ago
- ☆57Updated last year
- Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …☆61Updated 2 years ago
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning☆44Updated 7 months ago
- [CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion m…☆60Updated 9 months ago
- [ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning☆48Updated last month
- ☆62Updated last year
- Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Pe…☆121Updated last year
- Easy-to-use, efficient code for extracting OpenAI CLIP (Global/Grid) features from images and text.☆122Updated 2 months ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)☆48Updated last year
- ☆84Updated 2 years ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval☆53Updated 9 months ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆133Updated last year
- Flickr30K Entities Dataset☆170Updated 6 years ago
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. (CVPR 2023)☆60Updated 3 weeks ago
- Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)☆122Updated 2 years ago
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning☆136Updated last year
- Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"☆88Updated 3 months ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆130Updated 2 years ago
- Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI 2021]☆57Updated 2 years ago
- Code to train CLIP model☆107Updated 3 years ago
- MixGen: A New Multi-Modal Data Augmentation☆122Updated 2 years ago