facebookresearch / SIEVE

SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024)

☆16

Alternatives and similar repositories for SIEVE

Users who are interested in SIEVE are comparing it to the repositories listed below.

Optimization-AI / FastCLIP
Distributed Optimization Infra for learning CLIP models
☆26 Updated 9 months ago
mshukor / eP-ALM
[ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.
☆27 Updated last year
ethanlshen / HierNet
Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…
☆21 Updated last year
jeykigung / HiCLIP
☆29 Updated 2 years ago
jonkahana / CLIPPR
An official PyTorch implementation for CLIPPR
☆29 Updated last year
ggjy / vision_weak_to_strong
☆38 Updated last year
jmerullo / limber
https://arxiv.org/abs/2209.15162
☆50 Updated 2 years ago
ExplainableML / fomo_in_flux
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
☆57 Updated 7 months ago
hammoudhasan / SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆100 Updated 3 months ago
elad-amrani / xtra
PyTorch implementation of "Sample- and Parameter-Efficient Auto-Regressive Image Models" from CVPR 2025
☆12 Updated 4 months ago
aszala / VPEval
VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)
☆45 Updated last year
hammoudhasan / DiversitySSL
Original code base for On Pretraining Data Diversity for Self-Supervised Learning
☆13 Updated 6 months ago
eric-ai-lab / Discffusion
Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"
☆29 Updated last year
k1rezaei / Text-to-concept
☆33 Updated last year
ilkerkesen / ViLMA
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)
☆16 Updated last year
allenai / grit_official
Official repository for the General Robust Image Task (GRIT) Benchmark
☆54 Updated 2 years ago
AtsuMiyai / UPD
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
☆77 Updated last month
philippe-eecs / small-vision
A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.
☆34 Updated last year
MengLcool / DeepStack-VL
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…
☆37 Updated last year
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 Updated 2 years ago
showlab / datacentric.vlp
Compress conventional Vision-Language Pre-training data
☆51 Updated last year
ml-jku / semantic-image-text-alignment
☆24 Updated 2 years ago
drimpossible / ACM
Codebase for adaptive continual memory
☆13 Updated last year
locuslab / llava-token-compression
☆42 Updated 8 months ago
showlab / cosmo
☆71 Updated last year
OliverRensu / D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Lea…
☆98 Updated last year
HanSolo9682 / CounterCurate
This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.
☆18 Updated last year
facebookresearch / maws
Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496
☆91 Updated 3 months ago
codezakh / LilT
[ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
☆39 Updated last year
facebookresearch / genecis
Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity"
☆59 Updated 2 years ago