The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
☆252 · Jan 22, 2025 · Updated last year
Alternatives and similar repositories for ml-veclip
Users who are interested in ml-veclip are comparing it to the libraries listed below.
- ☆59 · Mar 14, 2024 · Updated 2 years ago
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models" ICLR 2024 ☆113 · Jun 11, 2024 · Updated last year
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models ☆27 · Nov 29, 2023 · Updated 2 years ago
- Densely Captioned Images (DCI) dataset repository. ☆198 · Jul 1, 2024 · Updated last year
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs. ☆103 · Mar 23, 2025 · Updated last year
- [ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?" ☆149 · Jun 13, 2024 · Updated last year
- [ECCV 2024] Official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP" ☆892 · Aug 13, 2024 · Updated last year
- This repository provides the code and model checkpoints for the AIMv1 and AIMv2 research projects. ☆1,410 · Aug 4, 2025 · Updated 7 months ago
- [NeurIPS 2023] Text data, code and pre-trained models for the paper "Improving CLIP Training with Language Rewrites" ☆288 · Jan 14, 2024 · Updated 2 years ago
- MultimodalC4 is a multimodal extension of C4 that interleaves millions of images with text. ☆954 · Mar 19, 2025 · Updated last year
- Tune-Mode ConvBN Blocks For Efficient Transfer Learning ☆18 · Aug 1, 2023 · Updated 2 years ago
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale ☆214 · Feb 27, 2024 · Updated 2 years ago
- NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024 ☆1,826 · Nov 27, 2025 · Updated 3 months ago
- Tool for exporting Apple Neural Engine-accelerated versions of transformers models on HuggingFace Hub. ☆13 · Mar 16, 2026 · Updated last week
- Load any clip model with a standardized interface ☆22 · Oct 20, 2025 · Updated 5 months ago
- This repository contains the official implementation of the research papers "MobileCLIP" CVPR 2024 and "MobileCLIP2" TMLR August 2025 ☆1,461 · Oct 9, 2025 · Updated 5 months ago
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆159 · Dec 6, 2024 · Updated last year
- 4M: Massively Multimodal Masked Modeling ☆1,791 · Jun 2, 2025 · Updated 9 months ago
- DataComp: In search of the next generation of multimodal datasets ☆771 · Apr 28, 2025 · Updated 10 months ago
- ☆92 · Jan 4, 2024 · Updated 2 years ago
- LLM2CLIP significantly improves already state-of-the-art CLIP models. ☆643 · Feb 1, 2026 · Updated last month
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models ☆47 · Sep 25, 2023 · Updated 2 years ago
- ☆23 · Oct 12, 2022 · Updated 3 years ago
- Self-Conditioning Pre-Trained Language Models, ICML 2022 ☆34 · Jul 12, 2022 · Updated 3 years ago
- ☆14 · Jul 2, 2024 · Updated last year
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era" ☆212 · Jun 9, 2024 · Updated last year
- When do we not need larger vision models? ☆415 · Feb 8, 2025 · Updated last year
- ☆13 · Feb 5, 2024 · Updated 2 years ago
- Grounded Language-Image Pre-training ☆2,585 · Jan 24, 2024 · Updated 2 years ago
- An open source implementation of CLIP. ☆13,528 · Mar 12, 2026 · Updated last week
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆3,388 · May 19, 2025 · Updated 10 months ago
- COYO-700M: Large-scale Image-Text Pair Dataset ☆1,251 · Nov 30, 2022 · Updated 3 years ago
- ☆10 · Jul 5, 2024 · Updated last year
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆320 · Jun 3, 2024 · Updated last year
- [CVPR 2024] The official implementation of the paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding" ☆53 · Jun 16, 2025 · Updated 9 months ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆141 · Dec 16, 2025 · Updated 3 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆138 · May 8, 2025 · Updated 10 months ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specific… ☆81 · Sep 13, 2024 · Updated last year