hammoudhasan / SynthCLIP

Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.

☆100

Alternatives and similar repositories for SynthCLIP

Users who are interested in SynthCLIP are comparing it to the libraries listed below.

OliverRensu / D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…
☆98 Updated last year
WalBouss / MaskInversion
☆26 Updated 9 months ago
chenshuang-zhang / imagenet_d
[CVPR 2024 Highlight] ImageNet-D
☆43 Updated 9 months ago
facebookresearch / genecis
Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"
☆60 Updated 2 years ago
Understanding-Visual-Datasets / VisDiff
Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral)
☆120 Updated last year
UX-Decoder / FIND
[NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings"
☆125 Updated 11 months ago
hananshafi / llmblueprint
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
☆80 Updated last year
lisadunlap / ALIA
Augmenting with Language-guided Image Augmentation (ALIA)
☆76 Updated last year
ChenDelong1999 / subobjects
Official repository of paper "Subobject-level Image Tokenization" (ICML-25)
☆80 Updated last month
WalBouss / GEM
[CVPR24] Official Implementation of GEM (Grounding Everything Module)
☆127 Updated 3 months ago
UCSC-VLAA / Recap-DataComp-1B
[ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"
☆138 Updated last year
facebookresearch / meru
Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023
☆174 Updated last year
OliverRensu / DeepMIM
[WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling
☆53 Updated 2 months ago
ExplainableML / ImageSelect
Code for the paper "If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection"
☆27 Updated 2 years ago
hammoudhasan / DiversitySSL
Original code base for On Pretraining Data Diversity for Self-Supervised Learning
☆13 Updated 7 months ago
alhojel / visual_task_vectors
☆39 Updated last year
linzhiqiu / CLIP-FlanT5
Training code for CLIP-FlanT5
☆27 Updated last year
RotsteinNoam / FuseCap
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
☆55 Updated last year
ziplab / SN-Netv2
[ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".
☆28 Updated last year
tripletclip / TripletCLIP
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆41 Updated 8 months ago
eslambakr / HRS_benchmark
☆59 Updated last year
zeyofu / BLINK_Benchmark
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…
☆133 Updated last year
ZhangYuanhan-AI / visual_prompt_retrieval
[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
☆177 Updated last year
renwang435 / video-ttt-release
Test-Time Training on Video Streams
☆64 Updated 2 years ago
iancovert / locality-alignment
☆51 Updated 6 months ago
j-min / VPGen
Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆56 Updated 2 years ago
mu-cai / matryoshka-mm
Matryoshka Multimodal Models
☆112 Updated 6 months ago
Pepper-lll / LMforImageGeneration
Codebase for the paper "Elucidating the design space of language models for image generation"
☆45 Updated 8 months ago
wjpoom / SPEC
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆45 Updated last month
zhangjiewu / awesome-t2i-eval
A curated list of papers and resources for text-to-image evaluation.
☆30 Updated last year