google / imageinwords
Data release for the ImageInWords (IIW) paper.
☆209Updated 5 months ago
Alternatives and similar repositories for imageinwords:
Users who are interested in imageinwords are comparing it to the repositories listed below.
- [ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"☆129Updated 10 months ago
- A family of highly capable yet efficient large multimodal models☆179Updated 8 months ago
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"☆146Updated last month
- ☆171Updated last year
- Matryoshka Multimodal Models☆101Updated 3 months ago
- [CVPR 2024] VCoder: Versatile Vision Encoders for Multimodal Large Language Models☆278Updated last year
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale☆209Updated last year
- ☆176Updated 6 months ago
- The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"☆245Updated 3 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional image generation models. (ICLR 2024)☆168Updated 3 weeks ago
- [ECCV 2024] Official PyTorch implementation of "Getting it Right: Improving Spatial Consistency in Text-to-Image Models"☆99Updated 10 months ago
- ☆85Updated last year
- GenEval: An object-focused framework for evaluating text-to-image alignment