kingthreestones/RefCLIP

☆39

Alternatives and similar repositories for RefCLIP

Users who are interested in RefCLIP are comparing it to the libraries listed below.

luogen1996 / SimREC
A lightweight codebase for referring expression comprehension and segmentation
☆57 · May 21, 2022 · Updated 4 years ago
Show-han / Zeroshot_REC
Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)
☆28 · Jun 21, 2024 · Updated 2 years ago
zjh31 / CPL
☆21 · Apr 2, 2024 · Updated 2 years ago
uvavision / AMC-grounding
[CVPR 2023] Code for "Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations"
☆19 · Oct 10, 2023 · Updated 2 years ago
Mr-Neko / JM3D
The official implementation of JM3D.
☆31 · Apr 8, 2026 · Updated 3 months ago
mengcaopku / DCNet
[ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension
☆15 · Sep 4, 2022 · Updated 3 years ago
ChunmingHe / Camouflageator
☆30 · Dec 2, 2024 · Updated last year
ZzZZCHS / WS-3DVG
[ICCV 2023] Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
☆14 · Oct 2, 2024 · Updated last year
liuting20 / DARA
[ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
☆22 · Feb 26, 2025 · Updated last year
qinzzz / Multimodal-Alignment-Framework
Implementation for MAF: Multimodal Alignment Framework
☆46 · Nov 25, 2020 · Updated 5 years ago
ChunmingHe / WS-SAM
☆52 · Mar 25, 2024 · Updated 2 years ago
linhuixiao / CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
☆135 · Nov 10, 2025 · Updated 8 months ago
youngfly11 / ReIR-WeaklyGrounding.pytorch
The official PyTorch code for "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding" accepted by CVPR2021
☆28 · Oct 9, 2021 · Updated 4 years ago
narthchin / DEIQT
Checkpoints, logs and source code for AAAI-23 paper 'Data-Efficient Image Quality Assessment with Attention-Panel Decoder'
☆39 · Apr 3, 2024 · Updated 2 years ago
leotam / MIMIC-CXR-annotations
☆15 · Aug 4, 2020 · Updated 5 years ago
uvavision / SelfEQ
[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".
☆28 · Mar 1, 2024 · Updated 2 years ago
insomnia94 / DTWREG
Preliminary code for reviewers
☆13 · Mar 30, 2021 · Updated 5 years ago
DengPingFan / CSU
Concealed Scene Understanding, Visual Intelligence (VI), 2023
☆71 · Aug 15, 2025 · Updated 11 months ago
zhjohnchan / SK-VG
[CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.
☆34 · Jul 12, 2023 · Updated 3 years ago
Disguiser15 / RefTeacher
RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension.
☆14 · May 26, 2023 · Updated 3 years ago
shikras / d-cube
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating…
☆138 · Mar 20, 2024 · Updated 2 years ago
cskyl / SAM_WSSS
SAM Enhance Mask Quality for WSSS: This repository provides tools for generating, evaluating, and visualizing enhanced pseudo masks for W…
☆75 · Oct 9, 2023 · Updated 2 years ago
wangpengnorman / KB-Ref_dataset
☆16 · Dec 28, 2020 · Updated 5 years ago
jinxiang-liu / UFE-AVS
Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
☆19 · Jul 7, 2024 · Updated 2 years ago
NaturalKnight / SLViT
Code for IJCAI 2023 paper 'SLViT: Scale-Wise Language-Guided Vision Transformer for Referring Image Segmentation'
☆11 · May 28, 2023 · Updated 3 years ago
allenai / reclip
☆92 · Apr 15, 2022 · Updated 4 years ago
PVIT-official / PVIT
Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models
☆37 · Sep 19, 2023 · Updated 2 years ago
jianzongwu / betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
☆48 · Jul 18, 2024 · Updated 2 years ago
antoyang / TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
☆194 · Sep 24, 2023 · Updated 2 years ago
CASIA-IVA-Lab / SC-Tune
Official code for CVPR 2024 paper, "SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models"
☆16 · Apr 22, 2024 · Updated 2 years ago
DavidYan2001 / PVChat
[ICCV 2025] PVChat: Personalized Video Chat with One-Shot Learning
☆17 · Apr 4, 2026 · Updated 3 months ago
mmaaz60 / mdef_detr
☆11 · May 9, 2023 · Updated 3 years ago
vvvb-github / AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
☆74 · Mar 6, 2025 · Updated last year
Plusero / wur_life_lore
☆12 · Jun 4, 2026 · Updated last month
seanzhuh / SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
☆144 · Oct 30, 2024 · Updated last year
Yangr116 / BoxSnake
[ICCV 2023] BoxSnake official repository.
☆66 · May 28, 2024 · Updated 2 years ago
csitfun / ConTRoL-dataset
Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"
☆11 · Nov 18, 2022 · Updated 3 years ago
rolsheng / MM-VUFM4DS
【IEEE T-IV】A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios
☆49 · May 26, 2024 · Updated 2 years ago
mightyzau / InfMLLM
☆19 · Dec 6, 2023 · Updated 2 years ago