baoxiaoyi/CoReS

Code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"

☆23

Alternatives and similar repositories for CoReS

Users who are interested in CoReS are comparing it to the repositories listed below.

rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54 · Feb 4, 2026 · Updated 5 months ago
berkeley-hipie / segllm
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129 · Feb 20, 2025 · Updated last year
ysj9909 / StAR
[ECCV 2026] StAR: Segment Anything Reasoner
☆25 · Apr 2, 2026 · Updated 3 months ago
DanielSHKao / ThinkFirst
Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts"
☆22 · Jun 28, 2025 · Updated last year
Shengcao-Cao / groundLMM
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆47 · Oct 19, 2025 · Updated 9 months ago
congvvc / InstructSeg
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56 · Feb 10, 2025 · Updated last year
nnnth / UFO
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆281 · Nov 5, 2025 · Updated 8 months ago
rkzheng99 / ViLLa
Video Reasoning Segmentation
☆26 · Nov 29, 2024 · Updated last year
felixcheng97 / AGAP
[3DV 2025] Learning Naturally Aggregated Appearance for Efficient 3D Editing
☆33 · Feb 13, 2025 · Updated last year
hmchuong / CoLLM
[CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval
☆28 · Mar 26, 2025 · Updated last year
letitiabanana / PnP-OVSS
[CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
☆18 · Jul 22, 2024 · Updated 2 years ago
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47 · Jun 16, 2024 · Updated 2 years ago
GigaAI-research / PhysClaw
PhysClaw*: Physical Continual Learning Agent Workflow
☆42 · Mar 17, 2026 · Updated 4 months ago
cilinyan / VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆214 · Aug 5, 2024 · Updated last year
songw-zju / PointLoRA
The official implementation of "PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning" (CVPR 2025)
☆29 · Oct 31, 2025 · Updated 8 months ago
V-STaR-Bench / V-STaR
Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
☆45 · Mar 2, 2026 · Updated 4 months ago
LuFan31 / CompreCap
CVPR2025: Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
☆39 · Mar 21, 2025 · Updated last year
alibaba-damo-academy / VL-Cogito
☆24 · Nov 4, 2025 · Updated 8 months ago
JIA-Lab-research / Seg-Zero
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆636 · Jan 17, 2026 · Updated 6 months ago
LiYinqi / un2CLIP
[NeurIPS'25] A work to improve CLIP's visual detail capturing ability by inverting the unCLIP generative model.
☆26 · Mar 19, 2026 · Updated 4 months ago
wuw2019 / LoTLIP
[NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
☆49 · Jan 14, 2025 · Updated last year
QuentinFitteRey / VLMSAM
Qwen-SAM is a reasoning-based segmentation model that integrates Qwen 2.5 VL 7B with the Segment Anything Model (SAM), enabling fine-grai…
☆32 · Jun 4, 2025 · Updated last year
yvhangyang / ResCLIP
Official implementation of ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
☆68 · Updated this week
JerryXu0129 / HyP2-Loss
☆14 · Oct 10, 2022 · Updated 3 years ago
GigaAI-research / WonderFree
☆19 · Jun 26, 2025 · Updated last year
visiontao / dcnet
Effective Abstract Reasoning with Dual-Contrast Network
☆11 · Jun 25, 2021 · Updated 5 years ago
RenMin1991 / cleaned-DukeMTMC-reID
Cleaned test data list of DukeMTMC-reID, ICCV2021
☆15 · Aug 26, 2021 · Updated 4 years ago
ml-research / deictic-segment-anything
Segment Anything with Deictic Prompting
☆27 · May 13, 2025 · Updated last year
yfChang-cv / FVQ
Official Implementation of Paper: FVQ: Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization (ICLR2026)
☆26 · Jan 30, 2026 · Updated 6 months ago
GigaAI-research / EmbodieDreamer
☆33 · Jul 8, 2025 · Updated last year
seilk / LocalizationHeads
[CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
☆81 · Aug 31, 2025 · Updated 10 months ago
Hi-MrChen / 3d-human-reconstruction
☆24 · Jun 2, 2022 · Updated 4 years ago
MiliLab / Text-Before-Vision
[ICML 2026] Text Before Vision: Staged Knowledge Injection Matters for Agentic RLVR in Ultra-High-Resolution Remote Sensing Understanding
☆16 · Mar 13, 2026 · Updated 4 months ago
aim-uofa / SegAgent
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
☆108 · Aug 8, 2025 · Updated 11 months ago
GigaAI-research / HumanDreamer-X
☆23 · Jul 11, 2025 · Updated last year
GigaAI-research / GigaVideo-1
☆17 · Jun 13, 2025 · Updated last year
JIA-Lab-research / VisionReasoner
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348 · Feb 9, 2026 · Updated 5 months ago
zhengye1995 / DCIC22-Cow
DCIC22 (Digital China 2022): fourth-place solution for the cattle image segmentation competition
☆14 · Jul 18, 2022 · Updated 4 years ago
BBQtime / deformable-convolution-network-DCN-for-head-and-neck-tumor-segmentation
3D deformable convolution network (DCN) for head and neck tumor segmentation
☆11 · May 4, 2023 · Updated 3 years ago