linhuixiao/HiVG

linhuixiao / HiVG

[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.

☆65

Alternatives and similar repositories for HiVG

Users that are interested in HiVG are comparing it to the libraries listed below.

linhuixiao / OneRef
[NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.
☆32 · Nov 13, 2025 · Updated 8 months ago
linhuixiao / CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
☆135 · Nov 10, 2025 · Updated 8 months ago
Dmmm1997 / SimVG
[NeurIPS 2024] SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
☆103 · Oct 29, 2025 · Updated 8 months ago
chenwei746 / EEVG
☆23 · Aug 20, 2024 · Updated last year
Huntersxsx / RIS-Learning-List
Related papers about Referring Image Segmentation (RIS)
☆16 · Dec 26, 2023 · Updated 2 years ago
like413 / OPT-RSVG
[TGRS 2024] Language-Guided Progressive Attention for Visual Grounding in Remote Sensing Images.
☆56 · Jun 10, 2025 · Updated last year
Mr-Bigworth / MMCA
Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral)
☆26 · Jun 11, 2025 · Updated last year
jcwang0602 / PLVL
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding
☆13 · May 9, 2025 · Updated last year
om-ai-lab / GroundVLP
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
☆74 · Apr 10, 2026 · Updated 3 months ago
liuting20 / DARA
[ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
☆22 · Feb 26, 2025 · Updated last year
Show-han / Zeroshot_REC
Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)
☆28 · Jun 21, 2024 · Updated 2 years ago
Ideal-ljl / AerialVG
☆22 · Dec 2, 2025 · Updated 7 months ago
LANMNG / LQVG
☆32 · Nov 27, 2025 · Updated 7 months ago
callsys / ControlCap
[ECCV 2024] ControlCap: Controllable Region-level Captioning
☆81 · Oct 25, 2024 · Updated last year
WayneTomas / TransCP
[TPAMI 2024] This is the official PyTorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding"…
☆28 · May 8, 2025 · Updated last year
LinfengYuan1997 / LoSh
[CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation
☆13 · Jun 17, 2024 · Updated 2 years ago
MengyuanChen21 / ECCV2022-DELU
[ECCV 2022] Dual-Evidential Learning for Weakly-supervised Temporal Action Localization
☆49 · Apr 19, 2024 · Updated 2 years ago
liuting20 / SwimVG
Transactions on Multimedia (TMM25)
☆21 · Apr 8, 2025 · Updated last year
lerogo / aaai24_itr_cusa
Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"
☆55 · Mar 28, 2024 · Updated 2 years ago
WeitaiKang / SegVG
[ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
☆63 · Oct 22, 2024 · Updated last year
MengyuanChen21 / CVPR2023-CMPAE
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
☆37 · Jun 17, 2023 · Updated 3 years ago
MengyuanChen21 / CVPR2023-OWTAL
[CVPR 2023] Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization
☆12 · Jul 9, 2024 · Updated 2 years ago
jcwang0602 / MLLMSeg
MLLMSeg: Unlocking the Potential of MLLMs in Referring Expression Segmentation via a Light-weight Mask Decoder
☆57 · Jun 12, 2026 · Updated last month
MCG-NJU / CaReBench
A Fine-grained Benchmark for Video Captioning and Retrieval
☆30 · Jul 16, 2025 · Updated last year
ControlNet / NAVER
[ICCV] NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning
☆31 · May 30, 2026 · Updated last month
VividLe / ExtractVideoFeature
Extract video features. Currently, the models include I3D; the list will be continuously updated.
☆12 · Jun 4, 2020 · Updated 6 years ago
cv516Buaa / OV-VG
☆31 · Mar 25, 2024 · Updated 2 years ago
NMS05 / Patch-Aligned-Contrastive-Learning
☆24 · Jul 8, 2023 · Updated 3 years ago
florinshen / ULAST
This is the official repo of "Unsupervised Learning of Accurate Siamese Tracking"
☆19 · Mar 25, 2022 · Updated 4 years ago
callsys / DynRefer
[CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution
☆59 · Mar 4, 2025 · Updated last year
tomchen-ctj / OST
【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
☆39 · Apr 27, 2024 · Updated 2 years ago
kkakkkka / ETRIS
[ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
☆138 · Jun 26, 2025 · Updated last year
heshuting555 / DsHmp
[CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
☆83 · Jul 24, 2024 · Updated 2 years ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
djiajunustc / TransVG
☆198 · Feb 27, 2024 · Updated 2 years ago
saibr / hypvl
This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…
☆21 · Jul 5, 2024 · Updated 2 years ago
HengLan / CGSTVG
[CVPR 2024] Context-Guided Spatio-Temporal Video Grounding
☆66 · Jun 28, 2024 · Updated 2 years ago
mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
PRIS-CV / An-Erudite-FGVC-Model
Code release for "An Erudite Fine-Grained Visual Classification Model" (CVPR 2023)
☆17 · Jun 2, 2023 · Updated 3 years ago