umd-huang-lab / perceptionCLIP
Code for our ICLR 2024 paper "PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts"
☆76 · Updated 6 months ago
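The method's core idea, per the paper title: rather than matching an image directly against class prompts, first infer a contextual attribute z (background, setting, etc.) from the image, then classify with prompts conditioned on z and combine across contexts. Below is a minimal sketch of that two-step inference, assuming OpenAI's `clip` package; the class names, context list, prompt templates, and `example.jpg` are illustrative placeholders, not the repository's actual interface.

```python
# Minimal sketch of two-step "infer context, then condition" zero-shot
# classification. NOT the authors' implementation; contexts/classes are
# hypothetical examples.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

classes = ["dog", "cat"]                       # hypothetical label set
contexts = ["on grass", "in snow", "indoors"]  # hypothetical contextual attributes

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    img_feat = model.encode_image(image)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)

    # Step 1: infer the context z from the image alone -> p(z|x).
    ctx_tokens = clip.tokenize([f"a photo of something {c}" for c in contexts]).to(device)
    ctx_feat = model.encode_text(ctx_tokens)
    ctx_feat /= ctx_feat.norm(dim=-1, keepdim=True)
    p_ctx = (100.0 * img_feat @ ctx_feat.T).softmax(dim=-1)

    # Step 2: classify conditioned on each context -> p(y|x,z),
    # then marginalize over contexts weighted by p(z|x).
    probs = torch.zeros(1, len(classes), device=device)
    for j, c in enumerate(contexts):
        cls_tokens = clip.tokenize([f"a photo of a {y} {c}" for y in classes]).to(device)
        cls_feat = model.encode_text(cls_tokens)
        cls_feat /= cls_feat.norm(dim=-1, keepdim=True)
        p_cls = (100.0 * img_feat @ cls_feat.T).softmax(dim=-1)
        probs += p_ctx[0, j] * p_cls

print(dict(zip(classes, probs[0].tolist())))
```

See the repository and paper for the actual attribute sets, prompt construction, and combination rule used in the reported experiments.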
Related projects
Alternatives and complementary repositories for perceptionCLIP
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆69 · Updated last month
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ☆30 · Updated last year
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆104 · Updated 7 months ago
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models ☆81 · Updated 2 months ago
- [CVPR 2024 Highlight] Official implementation of Transferable Visual Prompting, from the paper "Exploring the Transferability of Visual Prompt… ☆32 · Updated 4 months ago
- Public repository for Image Clustering Conditioned on Text Criteria (IC|TC) ☆79 · Updated 7 months ago
- [NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization ☆95 · Updated 8 months ago
- [NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings" ☆110 · Updated 2 months ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆33 · Updated 2 months ago
- [ICLR 2023] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆36 · Updated last year
- [ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models ☆45 · Updated last month
- Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries" ☆47 · Updated 6 months ago
- Task Residual for Tuning Vision-Language Models (CVPR 2023) ☆65 · Updated last year
- Official implementation of "Why are Visually-Grounded Language Models Bad at Image Classification?" (NeurIPS 2024) ☆49 · Updated 3 weeks ago
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆59 · Updated 2 weeks ago
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆107 · Updated 4 months ago
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning" ☆34 · Updated 8 months ago
- Official repository of the paper "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs" ☆40 · Updated 2 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆105 · Updated last week
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs ☆87 · Updated 7 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆40 · Updated 3 months ago
- [NeurIPS 2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?" ☆166 · Updated 8 months ago
- [ACL 2024 Oral] Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback ☆52 · Updated last month
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding ☆47 · Updated 10 months ago
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆27 · Updated 8 months ago
- [NeurIPS 2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators" ☆102 · Updated last year
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆71 · Updated 6 months ago