HAWLYQ / InfoMetIC
☆14 · Updated last year
Alternatives and similar repositories for InfoMetIC
Users interested in InfoMetIC are comparing it to the repositories listed below.
- [CVPR 2023 & IJCV 2025] Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation ☆64 · Updated last month
- [ICCV 2023 (Oral)] Open-domain Visual Entity Recognition Towards Recognizing Millions of Wikipedia Entities ☆43 · Updated 2 months ago
- ☆24 · Updated last year
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models" ☆27 · Updated last year
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models ☆30 · Updated 5 months ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 · Updated last year
- Official implementation of "ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing" ☆74 · Updated last year
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (Findings) ☆198 · Updated last year
- [ACL 2023] MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning ☆35 · Updated last year
- Natural language guided image captioning ☆85 · Updated last year
- Benchmark data for "Rethinking Benchmarks for Cross-modal Image-text Retrieval" (SIGIR 2023) ☆25 · Updated 2 years ago
- ☆15 · Updated 3 years ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆36 · Updated last year
- Code for "Multitask Vision-Language Prompt Tuning" https://arxiv.org/abs/2211.11720 ☆57 · Updated last year
- Colorful Prompt Tuning for Pre-trained Vision-Language Models ☆49 · Updated 2 years ago
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation ☆132 · Updated last year
- Implementation of the paper https://arxiv.org/abs/2210.04559 ☆54 · Updated 2 years ago
- MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning ☆135 · Updated 2 years ago
- The SVO-Probes Dataset for Verb Understanding ☆31 · Updated 3 years ago
- Official repository for the A-OKVQA dataset ☆97 · Updated last year
- ☆37 · Updated last year
- Code and data for ImageCoDe, a contextual vision-and-language benchmark ☆40 · Updated last year
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022) ☆206 · Updated 2 years ago
- [CVPR 2024 Highlight] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning ☆31 · Updated 3 months ago
- ☆40 · Updated 2 years ago
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev… ☆38 · Updated 8 months ago
- Research code for "KAT: A Knowledge Augmented Transformer for Vision-and-Language" ☆67 · Updated 3 years ago
- [ICLR 2023] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆40 · Updated 2 years ago
- The released data for the paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models" ☆33 · Updated last year
- ☆16 · Updated 3 years ago