wysoczanska/clip_dinoiser

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/wysoczanska/clip_dinoiser)

wysoczanska / clip_dinoiser

Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.

☆275

Alternatives and similar repositories for clip_dinoiser

Users that are interested in clip_dinoiser are comparing it to the libraries listed below

Sorting:

WalBouss / GEM
View on GitHub
[CVPR24] Official Implementation of GEM (Grounding Everything Module)
☆138Apr 10, 2025Updated 10 months ago
wangf3014 / SCLIP
View on GitHub
Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
☆182Oct 10, 2024Updated last year
wysoczanska / clip-diy
View on GitHub
Official implementation of the WACV 2024 paper CLIP-DIY
☆34Dec 20, 2023Updated 2 years ago
dahyun-kang / lavg
View on GitHub
[ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
☆49Sep 24, 2024Updated last year
bytedance / fc-clip
View on GitHub
[NeurIPS 2023] This repo contains the code for our paper Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convoluti…
☆338Feb 5, 2024Updated 2 years ago
amitakamath / vl_text_encoders_are_bottlenecks
View on GitHub
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11May 24, 2023Updated 2 years ago
sinahmr / NACLIP
View on GitHub
PyTorch Implementation of NACLIP in "Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"
☆73Sep 23, 2024Updated last year
chongzhou96 / MaskCLIP
View on GitHub
Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral)
☆468Sep 19, 2022Updated 3 years ago
SunzeY / AlphaCLIP
View on GitHub
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
☆867Jul 20, 2025Updated 7 months ago
cvlab-kaist / CAT-Seg
View on GitHub
Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"
☆364Apr 11, 2024Updated last year
YuchenLiu98 / COMM
View on GitHub
Pytorch code for paper From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models
☆207Jan 8, 2025Updated last year
Qinying-Liu / Awesome-Open-Vocabulary-Semantic-Segmentation
View on GitHub
A curated publication list on open vocabulary semantic segmentation and related area (e.g. zero-shot semantic segmentation) resources..
☆838Jan 20, 2026Updated last month
Jiawei-Yang / Denoising-ViT
View on GitHub
This is the official code release for our work, Denoising Vision Transformers.
☆395Nov 13, 2024Updated last year
HarborYuan / ovsam
View on GitHub
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
☆1,029Aug 4, 2025Updated 7 months ago
mc-lan / ProxyCLIP
View on GitHub
[ECCV2024] ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
☆111Mar 26, 2025Updated 11 months ago
mbzuai-oryx / groundingLMM
View on GitHub
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha…
☆945Aug 5, 2025Updated 7 months ago
kirill-vish / Beyond-INet
View on GitHub
Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"
☆102Sep 11, 2024Updated last year
sung-yeon-kim / R-Adapter-ECCV2024
View on GitHub
Official PyTorch Implementation of Efficient and Versatile Robust Fine-Tuning of Zero-shot Models, ECCV 2024
☆17Oct 3, 2024Updated last year
UX-Decoder / DINOv
View on GitHub
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
☆529Apr 8, 2024Updated last year
amitakamath / whatsup_vlms
View on GitHub
Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning".
☆71Feb 28, 2024Updated 2 years ago
ethanlshen / HierNet
View on GitHub
Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…
☆22Nov 8, 2023Updated 2 years ago
UX-Decoder / FIND
View on GitHub
[NeurIPS 2024] Official implementation of the paper "Interfacing Foundation Models' Embeddings"
☆131Aug 21, 2024Updated last year
bfshi / scaling_on_scales
View on GitHub
When do we not need larger vision models?
☆413Feb 8, 2025Updated last year
Xujxyang / OpenTrans
View on GitHub
☆24Apr 17, 2024Updated last year
Vibashan / PosSAM
View on GitHub
Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything
☆70Apr 7, 2024Updated last year
jianzongwu / Awesome-Open-Vocabulary
View on GitHub
(TPAMI 2024) A Survey on Open Vocabulary Learning
☆994Dec 24, 2025Updated 2 months ago
wusize / CLIPSelf
View on GitHub
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
☆201Feb 5, 2024Updated 2 years ago
mlpc-ucsd / MasQCLIP
View on GitHub
(ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation
☆37Oct 18, 2023Updated 2 years ago
MendelXu / SAN
View on GitHub
Open-vocabulary Semantic Segmentation
☆375Oct 16, 2024Updated last year
lambert-x / ProLab
View on GitHub
Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…
☆55Aug 27, 2025Updated 6 months ago
kevin-ssy / CLIP_as_RNN
View on GitHub
Official Implementation for CVPR 2024 paper: CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor
☆111Jun 23, 2024Updated last year
mc-lan / ClearCLIP
View on GitHub
[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
☆97Mar 26, 2025Updated 11 months ago
xmed-lab / CLIP_Surgery
View on GitHub
[Pattern Recognition 25] CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
☆467Mar 1, 2025Updated last year
mbzuai-oryx / Video-LLaVA
View on GitHub
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
☆262Aug 5, 2025Updated 7 months ago
beichenzbc / Long-CLIP
View on GitHub
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
☆892Aug 13, 2024Updated last year
mlpc-ucsd / MaskCLIP
View on GitHub
Code Release for MaskCLIP (ICML 2023)
☆76Nov 29, 2023Updated 2 years ago
Qinying-Liu / TagAlign
View on GitHub
Official implementation of TagAlign
☆35Dec 11, 2024Updated last year
jmiemirza / LaFTer
View on GitHub
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections (NeurIPS 2023)
☆29Dec 27, 2023Updated 2 years ago
YuHengsss / Trident
View on GitHub
[ICCV2025] Harnessing CLIP, DINO and SAM for Open Vocabulary Segmentation
☆113Nov 22, 2025Updated 3 months ago