Hanzy1996/OpenSeg-R

OpenSeg-R: Improving Open-Vocabulary Segmentation via Step-by-Step Visual Reasoning

☆29

Alternatives and similar repositories for OpenSeg-R

Users who are interested in OpenSeg-R are comparing it to the libraries listed below.

jasongief / TGS-Agent
[2026 AAAI] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
☆20 · Nov 8, 2025 · Updated 8 months ago
Amazingren / MIRAGE
(ICLR2026) Efficient Degradation-agnostic Image Restoration via Channel-Wise Functional Decomposition and Manifold Regularization
☆35 · Jun 23, 2026 · Updated 3 weeks ago
mbzuai-oryx / Video-R2
Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models
☆19 · Jan 21, 2026 · Updated 6 months ago
jasongief / Mettle
[2025 TPAMI] Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation
☆17 · Jan 3, 2026 · Updated 6 months ago
SalesforceAIResearch / strefer
Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data
☆19 · Jun 2, 2026 · Updated last month
mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
jasongief / LEAP
[2024 ECCV] Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
☆14 · Nov 17, 2024 · Updated last year
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
hustvl / MaskAdapter
[CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"
☆135 · Oct 23, 2025 · Updated 8 months ago
yasserben / FLOSS
[ICCV 2025] FLOSS: Plug-in Training-free and label-free text template selection that boosts OVSS methods
☆35 · Feb 13, 2026 · Updated 5 months ago
zwq456 / CLIP-VIS
[IEEE TCSVT] Official Pytorch Implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation.
☆48 · Jan 7, 2025 · Updated last year
mbzuai-oryx / ARB
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
☆17 · May 25, 2025 · Updated last year
aminebdj / 3D-OWIS
[NeurIPS2023] 3D-OWIS is capable of detecting unknown instances in inference, and progressively learning novel classes in the process of …
☆68 · Dec 3, 2023 · Updated 2 years ago
jinbae-s / ACVIS
[ICASSP 2026] The official pytorch implementation of ACVIS
☆15 · Jan 19, 2026 · Updated 6 months ago
yayafengzi / ALToLLM
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
☆30 · May 27, 2025 · Updated last year
wang-chaoyang / RefLDMSeg
[AAAI 2025] Explore In-Context Segmentation via Latent Diffusion Models
☆22 · Mar 25, 2025 · Updated last year
SuleBai / SC-CLIP
[TIP 2025] Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
☆72 · Mar 27, 2026 · Updated 3 months ago
GeWu-Lab / TSPM
Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
☆17 · Oct 25, 2024 · Updated last year
yingchengy / AVMOE
[NeurIPS 2024] Mixture of Experts for Audio-Visual Learning
☆25 · Jan 19, 2025 · Updated last year
solicucu / D3G
☆15 · Oct 30, 2023 · Updated 2 years ago
yongliu20 / SCAN
[CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"
☆77 · Sep 23, 2024 · Updated last year
jasongief / OV-AVEL
[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization
☆46 · Mar 7, 2025 · Updated last year
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
GeWu-Lab / LFAV
Towards Long Form Audio-visual Video Understanding
☆15 · Jan 16, 2026 · Updated 6 months ago
xiaomoguhz / DeCLIP
[CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
☆169 · Jan 10, 2026 · Updated 6 months ago
OpenMask3D / openmask3d.github.io
☆11 · May 8, 2024 · Updated 2 years ago
xb534 / SED
[TPAMI2025&CVPR2024] Official Pytorch Implementation of SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation.
☆199 · May 30, 2024 · Updated 2 years ago
schowdhury671 / meerkat
☆35 · Jul 9, 2025 · Updated last year
chunmeifeng / DiffTPT
【ICCV 2023】Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning & 【IJCV 2025】Diffusion-Enhanced Test-time Adap…
☆72 · Jan 15, 2025 · Updated last year
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆17 · Jul 13, 2025 · Updated last year
RobertLuo1 / CoHD
The official implementation of A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation
☆27 · Aug 17, 2025 · Updated 11 months ago
HiLab-git / SicTTA
SicTTA: Single Image Continual Test-Time Adaptation for Medical Image Segmentation
☆18 · Dec 21, 2025 · Updated 7 months ago
kumuji / Sa2VA-i
Sa2VA-i is an improved version of the popular Sa2VA model
☆16 · Nov 25, 2025 · Updated 7 months ago
yasserben / CLOUDS
[CVPR 2024] Official Implementation of Collaborating Foundation models for Domain Generalized Semantic Segmentation
☆77 · Apr 4, 2025 · Updated last year
AVC2-UESTC / OV2VSS
Official Implementation of Towards Open Vocabulary Video Semantic Segmentation
☆14 · Feb 27, 2025 · Updated last year
lisat-bair / LISAt_code
☆30 · Sep 2, 2025 · Updated 10 months ago
kaka0910 / MFogHub
Data and code release for "MFogHub" (CVPR 2025).
☆19 · Jan 29, 2026 · Updated 5 months ago
PolyU-ChenLab / UniPixel
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆247 · Jan 4, 2026 · Updated 6 months ago
val-iisc / VL2V-ADiP
[CVPR 2024] Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification
☆43 · Mar 6, 2024 · Updated 2 years ago