liuting20/SwimVG

Transactions on Multimedia (TMM25)

☆21

Alternatives and similar repositories for SwimVG

Users interested in SwimVG are comparing it to the repositories listed below.

liuting20 / MaPPER
[EMNLP 2024 Main] MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
☆16 · Jan 6, 2025 · Updated last year
liuting20 / DARA
[ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
☆22 · Feb 26, 2025 · Updated last year
liuting20 / MustDrop
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
☆36 · Jan 8, 2025 · Updated last year
liuting20 / Sparse-Tuning
☆30 · Jun 29, 2026 · Updated 3 weeks ago
jiaqihuang01 / DETRIS
[AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
☆74 · May 21, 2025 · Updated last year
williamium3000 / awesome-mllm-grounding
Awesome paper for multi-modal llm with grounding ability
☆21 · Oct 11, 2025 · Updated 9 months ago
anakin-skywalker-Joseph / Folder
Official Implementation of Paper FOLDER (ICCV2025) and Turbo (ECCV2024)
☆15 · Jun 27, 2025 · Updated last year
GXNU-ZhongLab / RSTrack
Explicit Context Reasoning with Supervision for Visual Tracking (ACM MM 25)
☆18 · Jul 20, 2025 · Updated last year
ziplab / MPVSS
☆33 · Feb 29, 2024 · Updated 2 years ago
kkakkkka / ETRIS
[ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
☆138 · Jun 26, 2025 · Updated last year
GeWu-Lab / Ref-AVS
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
☆50 · Oct 12, 2025 · Updated 9 months ago
GXNU-ZhongLab / EVPTrack
☆29 · Apr 3, 2024 · Updated 2 years ago
jcwang0602 / VPTracker
VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM
☆16 · Mar 10, 2026 · Updated 4 months ago
yahooo-m / VOS-Solution
ECCV 2024 STMA & CVPR 2024 1st MOSE & 1st VOT Challenge & 1st LSVOS v6
☆12 · Oct 16, 2024 · Updated last year
Run542968 / GAP
☆11 · Oct 13, 2024 · Updated last year
path2generalist / General-Level
On Path to Multimodal Generalist: General-Level and General-Bench
☆21 · Jul 11, 2025 · Updated last year
AI-HPC-Research-Team / GW_PE_prior_sampling
Prior Sampling for high dimension data with domain knowledge.
☆10 · Jan 11, 2022 · Updated 4 years ago
TencentARC / MindOmni
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
☆139 · Oct 15, 2025 · Updated 9 months ago
kkakkkka / MambaTalk
[NeurIPS-2024] The official code of MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
☆84 · Jan 9, 2026 · Updated 6 months ago
Dmmm1997 / SimVG
[NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
☆103 · Oct 29, 2025 · Updated 8 months ago
lllyasviel / google_blockly_prototypes
☆13 · Oct 31, 2024 · Updated last year
kkakkkka / Webot-AutoDrive-MazeBot
[C++] Automatic Navigation Car Based on the Webots Environment
☆22 · May 24, 2025 · Updated last year
SHI-Labs / Slow-Fast-Video-Multimodal-LLM
☆29 · Apr 8, 2025 · Updated last year
GXNU-ZhongLab / AQATrack
CVPR24
☆71 · Aug 4, 2024 · Updated last year
lllyasviel / forge-legacy-extensions
some archived legacy forge extensions
☆15 · Jul 26, 2024 · Updated last year
Coo1Sea / OVT-B-Dataset
[NeurIPS 2024] Repository for the paper "OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking".
☆28 · Nov 9, 2024 · Updated last year
siyuanliii / SLAck
Official Implementation of ECCV2024 paper: SLAck
☆29 · Sep 18, 2024 · Updated last year
Zhuo-Cao / FlashVTG
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025)
☆39 · Apr 17, 2025 · Updated last year
NJU-PCALab / STTrack
[AAAI 2025] Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking
☆118 · May 18, 2025 · Updated last year
HL-hanlin / Bifrost-1
Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)
☆47 · Nov 24, 2025 · Updated 7 months ago
JieHu1996 / DeformableMamba
☆24 · Jun 17, 2025 · Updated last year
Seaz9 / PiDiViT
When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection
☆19 · Nov 3, 2025 · Updated 8 months ago
NikoGuan / PreP-OCR
☆28 · Nov 29, 2025 · Updated 7 months ago
zhengxuJosh / AnySeg
Code & Weights for “Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation”
☆15 · Dec 6, 2024 · Updated last year
areyouok / simple-rpc
☆12 · Aug 7, 2022 · Updated 3 years ago
jasongief / LEAP
[2024 ECCV] Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
☆14 · Nov 17, 2024 · Updated last year
Dmmm1997 / MomentSeg
[ECCV2026] MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
☆24 · Jun 19, 2026 · Updated last month
linhuixiao / HiVG
[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
☆65 · Nov 10, 2025 · Updated 8 months ago
DUT-CSJ / PVUW2023-VSS-3rd
☆10 · Jun 13, 2023 · Updated 3 years ago