mlfoundations / imagenet-captionsLinks

Release of ImageNet-Captions

☆51

Alternatives and similar repositories for imagenet-captions

Users that are interested in imagenet-captions are comparing it to the libraries listed below

Sorting:

NVlabs / PALAVRA
☆53Updated 3 years ago
redcaps-dataset / redcaps-downloader
Command-line tool for downloading and extending the RedCaps dataset.
☆49Updated last year
salesforce / MUST
PyTorch code for MUST
☆107Updated 5 months ago
mlfoundations / patching
Patching open-vocabulary models by interpolating weights
☆91Updated 2 years ago
goel-shashank / CyCLIP
☆120Updated 2 years ago
kakaobrain / noc
☆46Updated last year
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆138Updated 2 years ago
j-min / DallEval
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
☆142Updated 4 months ago
ZhangYuanhan-AI / OmniBenchmark
[ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning framework.
☆108Updated last year
facebookresearch / OTTER
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …
☆69Updated 3 years ago
jmerullo / limber
https://arxiv.org/abs/2209.15162
☆52Updated 2 years ago
Weixin-Liang / MetaShift
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)
☆109Updated 3 years ago
RotsteinNoam / FuseCap
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
☆55Updated last year
ryanwebster90 / snip-dedup
☆103Updated last year
MIMICLab / L-Verse
L-Verse: Bidirectional Generation Between Image and Text
☆109Updated 6 months ago
mehdidc / DALLE_clip_score
Simple script to compute CLIP-based scores given a DALL-e trained model.
☆30Updated 4 years ago
Hritikbansal / generative-robustness
Create generated datasets and train robust classifiers
☆36Updated 2 years ago
k1rezaei / Text-to-concept
☆35Updated last year
fkodom / clip-text-decoder
Generate text captions for images from their embeddings.
☆115Updated 2 years ago
facebookresearch / genecis
Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"
☆60Updated 2 years ago
google-research-datasets / videoCC-data
VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automa…
☆78Updated 2 years ago
facebookresearch / clip-rocket
Code release for "Improved baselines for vision-language pre-training"
☆61Updated last year
yonatanbitton / data_efficient_masked_language_modeling_for_vision_and_language
Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".
☆18Updated 4 years ago
mlfoundations / clip_quality_not_quantity
☆29Updated 3 years ago
LAION-AI / scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
☆178Updated 4 months ago
google-research / si-score
☆23Updated 4 months ago
facebookresearch / CiT
Code for the paper titled "CiT Curation in Training for Effective Vision-Language Data".
☆78Updated 2 years ago
MadryLab / dataset-interfaces
Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation
☆45Updated 2 years ago
eric-ai-lab / Discffusion
Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"
☆30Updated last year
yuhui-zh15 / drml
Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)
☆34Updated 2 years ago