Show-han/Zeroshot_REC

Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)

☆28

Alternatives and similar repositories for Zeroshot_REC

Users interested in Zeroshot_REC are comparing it to the repositories listed below.

kingthreestones / RefCLIP
☆39 · Jun 28, 2023 · Updated 3 years ago
linxin0 / HQGS
☆13 · Apr 28, 2025 · Updated last year
liuting20 / DARA
[ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
☆22 · Feb 26, 2025 · Updated last year
VoyageWang / IteRPrimE
The official implementation of our paper ''IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Prima…
☆20 · Apr 6, 2025 · Updated last year
CASIA-IVA-Lab / SC-Tune
Official code for CVPR 2024 paper, "SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models"
☆16 · Apr 22, 2024 · Updated 2 years ago
uvavision / SelfEQ
[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".
☆28 · Mar 1, 2024 · Updated 2 years ago
linhuixiao / HiVG
[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
☆65 · Nov 10, 2025 · Updated 8 months ago
CurryYuan / ZSVG3D
[CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
☆64 · Aug 3, 2024 · Updated last year
facebookresearch / SIEVE
SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024)
☆21 · Apr 28, 2024 · Updated 2 years ago
fhgyuanshen / HybridGL
[CVPR 2025] Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation
☆37 · Jun 27, 2025 · Updated last year
mc-lan / ClearCLIP
[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
☆100 · Mar 26, 2025 · Updated last year
Seonghoon-Yu / Zero-shot-RIS
[CVPR 2023] Official code for "Zero-shot Referring Image Segmentation with Global-Local Context Features"
☆130 · Mar 17, 2025 · Updated last year
WangFei-2019 / SNARE
Project for SNARE benchmark
☆11 · Jun 5, 2024 · Updated 2 years ago
zhu-xlab / rrsis
☆22 · Jul 15, 2024 · Updated 2 years ago
xfactlab / I0T
[ACL Main 2025] I0T: Embedding Standardization Method Towards Zero Modality Gap
☆12 · Jun 18, 2025 · Updated last year
zjh31 / CPL
☆21 · Apr 2, 2024 · Updated 2 years ago
leaves162 / CLIPtrase
cliptrase
☆47 · Sep 1, 2024 · Updated last year
naver-ai / muco
Official Pytorch implementation of MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model (CVPR 2026)
☆15 · Apr 16, 2026 · Updated 3 months ago
zshyang / amg
☆12 · Sep 15, 2024 · Updated last year
xulingjing88 / WSMA
[AAAI 2024] Weakly Supervised Multimodal Affordance Grounding for Egocentric Images
☆13 · Nov 10, 2024 · Updated last year
seanzhuh / SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
☆144 · Oct 30, 2024 · Updated last year
RobertLuo1 / CoHD
The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
☆27 · Aug 17, 2025 · Updated 11 months ago
ZzZZCHS / WS-3DVG
[ICCV 2023] Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
☆14 · Oct 2, 2024 · Updated last year
linxin0 / SCPGabNet
☆39 · Nov 11, 2024 · Updated last year
letitiabanana / PnP-OVSS
[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
☆18 · Jul 22, 2024 · Updated 2 years ago
TencentARC / TaCA
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
☆16 · Jun 20, 2023 · Updated 3 years ago
GuanqiaoDing / CNN-CIFAR10
Implement and Compare VGG, ResNet and ResNeXt on CIFAR-10
☆10 · Mar 22, 2019 · Updated 7 years ago
neu-vi / Diag-HOI
☆27 · Aug 17, 2023 · Updated 2 years ago
Georgelingzj / up-to-date-Vision-Language-Models
Up-to-date Vision Language Models collection. Mainly focus on computer vision
☆20 · Feb 9, 2023 · Updated 3 years ago
dogehhh / ReCLIP
[CVPR'24 & IJCV'25] Pytorch Implementation for ReCLIP
☆58 · Aug 27, 2025 · Updated 11 months ago
JohannesMaxWel / neural_random_forests
Continuous relaxation of Random Regression Forests
☆16 · Mar 25, 2018 · Updated 8 years ago
RyanLiut / awesome-diverse-captioning
Some papers about *diverse* image (a few videos) captioning
☆25 · Apr 4, 2023 · Updated 3 years ago
paulgavrikov / vlm_shapebias
Official code for "Can We Talk Models Into Seeing the World Differently?" (ICLR 2025).
☆30 · Jan 26, 2025 · Updated last year
mvrl / ConText-CIR
[CVPR'25] ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
☆16 · Jun 17, 2026 · Updated last month
agneet42 / revision
[ECCV 2024] "REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models"
☆14 · Aug 6, 2024 · Updated last year
Lilidamowang / T2VIndexer-generativeSearch
☆16 · Aug 28, 2024 · Updated last year
Pter61 / osrcir
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]
☆72 · Jul 8, 2025 · Updated last year
om-ai-lab / GroundVLP
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
☆74 · Apr 10, 2026 · Updated 3 months ago
Timsty1 / FineCLIP
FineCLIP: Self-distilled Region-based CLIP for Better Fine-grained Understanding (NIPS24)
☆38 · Nov 12, 2025 · Updated 8 months ago