Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.
☆418 · Jul 14, 2025 · Updated 7 months ago
Alternatives and similar repositories for conceptual-12m
Users who are interested in conceptual-12m are comparing it to the repositories listed below.
- Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image … ☆563 · Aug 21, 2021 · Updated 4 years ago
- WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique imag… ☆1,101 · Sep 27, 2024 · Updated last year
- Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine. ☆4,372 · Oct 19, 2025 · Updated 4 months ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021) ☆374 · Jul 29, 2023 · Updated 2 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm ☆675 · Sep 19, 2022 · Updated 3 years ago
- [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations ☆565 · Aug 22, 2025 · Updated 6 months ago
- COYO-700M: Large-scale Image-Text Pair Dataset ☆1,252 · Nov 30, 2022 · Updated 3 years ago
- Code release for SLIP Self-supervision meets Language-Image Pre-training ☆787 · Feb 9, 2023 · Updated 3 years ago
- Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning" ☆800 · Jun 30, 2021 · Updated 4 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022) ☆209 · Dec 18, 2022 · Updated 3 years ago
- Oscar and VinVL ☆1,053 · Aug 28, 2023 · Updated 2 years ago
- Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER… ☆119 · Jan 13, 2021 · Updated 5 years ago
- project page for VinVL ☆359 · Jul 26, 2023 · Updated 2 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning… ☆729 · Aug 8, 2023 · Updated 2 years ago
- [CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning