SivanDoveh/DAC

Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models

☆28

Alternatives and similar repositories for DAC

Users who are interested in DAC are comparing it to the libraries listed below.

UCSB-AI / ComCLIP
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
☆37 · Aug 18, 2024 · Updated last year
marthaflinderslewis / clip-binding
Code to reproduce the experiments in the paper: Does CLIP Bind Concepts? Probing Compositionality in Large Image Models.
☆16 · Oct 14, 2023 · Updated 2 years ago
SivanDoveh / TSVLC
Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models
☆47 · Sep 25, 2023 · Updated 2 years ago
facebookresearch / DCI
Densely Captioned Images (DCI) dataset repository.
☆197 · Jul 1, 2024 · Updated 2 years ago
jmiemirza / MMFM-Challenge
Official repository for the MMFM challenge
☆26 · Jun 18, 2024 · Updated 2 years ago
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
elisakreiss / concadia
☆16 · Jan 3, 2023 · Updated 3 years ago
om-ai-lab / VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
☆138 · Apr 10, 2026 · Updated 3 months ago
tejas-gokhale / ALT
☆13 · Dec 10, 2022 · Updated 3 years ago
mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆294 · Jun 7, 2023 · Updated 3 years ago
jmiemirza / Meta-Prompting
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs (ECCV 2024)
☆20 · Jul 15, 2024 · Updated 2 years ago
junha1125 / Vision-Language-Model-in-ECCV-2024
☆17 · Oct 1, 2024 · Updated last year
Annusha / xmic
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024
☆11 · Nov 7, 2024 · Updated last year
lezhang7 / Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
☆56 · Apr 7, 2025 · Updated last year
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
harshays / inputgradients
Do input gradients highlight discriminative features? [NeurIPS 2021] (https://arxiv.org/abs/2102.12781)
☆12 · Jan 10, 2023 · Updated 3 years ago
uvavision / SyViC
[ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data
☆13 · Sep 30, 2023 · Updated 2 years ago
apple / ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
☆253 · Jan 22, 2025 · Updated last year
RaptorMai / MLLM-CompBench
[NeurIPS'25] MLLM-CompBench evaluates the comparative reasoning of MLLMs with 40K image pairs and questions across 8 dimensions of relati…
☆46 · Apr 21, 2025 · Updated last year
rabiulcste / vismin
[NeurIPS24] VisMin: Visual Minimal-Change Understanding
☆19 · Mar 3, 2025 · Updated last year
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆25 · May 14, 2026 · Updated 2 months ago
callsys / ControlCap
[ECCV 2024] ControlCap: Controllable Region-level Captioning
☆81 · Oct 25, 2024 · Updated last year
giangnguyen2412 / advanced-XAI-for-DeepLearning
Here I gather promising research directions to make DNNs interpretable
☆17 · Apr 11, 2024 · Updated 2 years ago
tripletclip / TripletCLIP
[NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"
☆48 · Dec 1, 2024 · Updated last year
ant-research / DreamLIP
[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions
☆138 · May 8, 2025 · Updated last year
jimmyxu123 / SELECT
This is the repository for "SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Recognition"
☆16 · Oct 8, 2024 · Updated last year
vinid / neg_clip
NegCLIP.
☆41 · Feb 6, 2023 · Updated 3 years ago
ugorsahin / Generative-Negative-Mining
[WACV 2024] Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining, WACV 2024
☆13 · Jan 3, 2024 · Updated 2 years ago
callsys / DynRefer
[CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
☆59 · Mar 4, 2025 · Updated last year
RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 3 years ago
yonatanbitton / wysiwyr
☆37 · Oct 7, 2023 · Updated 2 years ago
google-deepmind / svo_probes
The SVO-Probes Dataset for Verb Understanding
☆29 · Jan 28, 2022 · Updated 4 years ago
mlfoundations / imagenet-captions
Release of ImageNet-Captions
☆51 · Jan 20, 2023 · Updated 3 years ago
ONground-Korea / 2023-AIKU_DeepLearning-Bootcamp
2023-1 Korea University AIKU Deep Learning Vacation Bootcamp: Deep into Deep
☆10 · Jul 10, 2023 · Updated 3 years ago
jiyounglee-0523 / VisAlign
☆20 · Apr 23, 2024 · Updated 2 years ago
princetonvisualai / pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
☆19 · Oct 4, 2022 · Updated 3 years ago
KID-22 / Source-Bias
Code for "Neural Retrievers are Biased Towards LLM-Generated Content"
☆14 · Oct 18, 2024 · Updated last year
hammoudhasan / SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆104 · Mar 23, 2025 · Updated last year
BatsResearch / ex2
If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions
☆17 · Apr 4, 2024 · Updated 2 years ago