Data release for the ImageInWords (IIW) paper.
☆227 · Nov 17, 2024 · Updated last year
Alternatives and similar repositories for imageinwords
Users who are interested in imageinwords are comparing it to the libraries listed below.
- Densely Captioned Images (DCI) dataset repository. ☆198 · Jul 1, 2024 · Updated last year
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025) ☆12 · Mar 6, 2025 · Updated last year
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆159 · Dec 6, 2024 · Updated last year
- [CVPR24 Highlights] Polos: Multimodal Metric Learning from Human Feedback for Image Captioning ☆33 · May 25, 2025 · Updated 9 months ago
- NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024 ☆1,826 · Nov 27, 2025 · Updated 3 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆138 · May 8, 2025 · Updated 10 months ago
- ☆33 · Nov 4, 2024 · Updated last year
- LLM2CLIP significantly improves already state-of-the-art CLIP models. ☆643 · Feb 1, 2026 · Updated last month
- ☆157 · Oct 31, 2024 · Updated last year
- Codebase for Aria - an Open Multimodal Native MoE ☆1,086 · Jan 22, 2025 · Updated last year
- [ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP" ☆892 · Aug 13, 2024 · Updated last year
- ☆75 · Mar 7, 2024 · Updated 2 years ago
- CenterMask2 on detectron2 (open images) ☆10 · May 28, 2020 · Updated 5 years ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆1,089 · Jan 21, 2025 · Updated last year
- This is the repository for the Photorealistic Unreal Graphics (PUG) datasets for representation learning. ☆237 · Apr 4, 2024 · Updated last year
- Imagen-mini for girl image generation ☆12 · Nov 19, 2022 · Updated 3 years ago
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha… ☆949 · Aug 5, 2025 · Updated 7 months ago
- When do we not need larger vision models? ☆415 · Feb 8, 2025 · Updated last year
- ☆401 · Dec 12, 2024 · Updated last year
- ☆112 · Jan 8, 2025 · Updated last year
- Official implementation of SEED-LLaMA (ICLR 2024). ☆642 · Sep 21, 2024 · Updated last year
- Code of paper "A new baseline for edge detection: Make Encoder-Decoder great again" ☆40 · Jun 11, 2025 · Updated 9 months ago
- Cambrian-1 is a family of multimodal LLMs with a vision-centric design. ☆1,995 · Nov 7, 2025 · Updated 4 months ago
- Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024) ☆186 · Jul 5, 2024 · Updated last year
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,253 · Feb 16, 2025 · Updated last year
- Official implementation of TagAlign ☆37 · Dec 11, 2024 · Updated last year
- Pipeline to scrape prompt + image url pairs from LAION `share-dalle-3` discord channel ☆11 · Oct 10, 2023 · Updated 2 years ago
- ☆58 · Apr 24, 2024 · Updated last year
- ☆15 · May 13, 2024 · Updated last year
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- 1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundatio… ☆226 · Aug 23, 2024 · Updated last year
- Grounded Language-Image Pre-training ☆2,585 · Jan 24, 2024 · Updated 2 years ago
- RO-ViT CVPR 2023 "Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers" ☆17 · Aug 24, 2023 · Updated 2 years ago
- EdgeSAM model for use with Autodistill. ☆30 · Jun 11, 2024 · Updated last year
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ☆415 · May 5, 2025 · Updated 10 months ago
- [ICLR 2025] Diffusion Feedback Helps CLIP See Better ☆301 · Jan 23, 2025 · Updated last year
- ☆4,607 · Sep 14, 2025 · Updated 6 months ago
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆3,388 · May 19, 2025 · Updated 10 months ago
- [CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text ☆53 · Mar 16, 2025 · Updated last year