tonychenxyz / vit-interpret

Official implementation of "Interpreting and Controlling Vision Foundation Models via Text Explanations"

☆13

Alternatives and similar repositories for vit-interpret

Users who are interested in vit-interpret are comparing it to the repositories listed below

boyazeng / understand_bias
Code release for "Understanding Bias in Large-Scale Visual Datasets"
☆21 Updated 10 months ago
OliverRensu / DeepMIM
[WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling
☆54 Updated 5 months ago
chenshuang-zhang / imagenet_d
[CVPR 2024 Highlight] ImageNet-D
☆44 Updated last year
sheng-eatamath / S3A
repo for paper titled: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment (AAAI'24 Oral)
☆25 Updated last year
eslambakr / HRS_benchmark
☆61 Updated 2 years ago
ZhangYuanhan-AI / visual_prompt_retrieval
[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
☆179 Updated last year
wjpoom / SPEC
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆49 Updated 4 months ago
hammoudhasan / SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆100 Updated 7 months ago
altndrr / vic
Code implementation of our NeurIPS 2023 paper: Vocabulary-free Image Classification
☆107 Updated last year
k1rezaei / Text-to-concept
☆35 Updated last year
ChengHan111 / VPT-or-FT
Official PyTorch implementation of 'Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning?' (ICLR 2024)
☆13 Updated last year
lisadunlap / ALIA
Augmenting with Language-guided Image Augmentation (ALIA)
☆81 Updated last year
lorebianchi98 / FG-CLIP
[CBMI2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
☆28 Updated 5 months ago
UCSC-VLAA / CLIPS
An Enhanced CLIP Framework for Learning with Synthetic Captions
☆37 Updated 6 months ago
hananshafi / llmblueprint
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
☆82 Updated last year
BAAI-DCAI / Training-Data-Synthesis
[ICLR 2024] Real-Fake: Effective Training Data Synthesis Through Distribution Matching
☆79 Updated last year
arijitray1993 / COLA
COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes!
☆24 Updated 11 months ago
sjz5202 / LLaVA-Reward
Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
☆20 Updated 2 months ago
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆44 Updated 2 weeks ago
Picsart-AI-Research / IPL-Zero-Shot-Generative-Model-Adaptation
[CVPR 2023] Zero-shot Generative Model Adaptation via Image-specific Prompt Learning
☆83 Updated 2 years ago
linzhiqiu / visual_gpt_score
VisualGPTScore for visio-linguistic reasoning
☆27 Updated 2 years ago
linzhiqiu / CLIP-FlanT5
Training code for CLIP-FlanT5
☆30 Updated last year
facebookresearch / genecis
Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity"
☆60 Updated 2 years ago
FreedomIntelligence / TRIM
We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆15 Updated 10 months ago
syp2ysy / prompt-SelF
[TIP] Exploring Effective Factors for Improving Visual In-Context Learning
☆19 Updated 3 months ago
1jsingh / Divide-Evaluate-and-Refine
Repo for our NeurIPS 2023 paper on: Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Fee…
☆26 Updated last year
iancovert / locality-alignment
☆53 Updated 9 months ago
yoctta / XPaste
☆52 Updated 2 years ago
zycheiheihei / Transferable-Visual-Prompting
[CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt…
☆44 Updated 10 months ago
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆32 Updated 2 years ago