Code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆22 · Nov 24, 2025 · Updated 3 months ago
Alternatives and similar repositories for CoReS
Users who are interested in CoReS are comparing it to the repositories listed below
- Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts" ☆22 · Jun 28, 2025 · Updated 8 months ago
- Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision ☆42 · Oct 19, 2025 · Updated 4 months ago
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆28 · Mar 26, 2025 · Updated 11 months ago
- Segment Anything with Deictic Prompting ☆27 · May 13, 2025 · Updated 9 months ago
- Video Reasoning Segmentation ☆28 · Nov 29, 2024 · Updated last year
- Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning ☆41 · Updated this week
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆205 · Aug 5, 2024 · Updated last year
- [CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding ☆62 · Aug 31, 2025 · Updated 6 months ago
- A collection of awesome think with videos papers. ☆91 · Dec 1, 2025 · Updated 3 months ago
- [ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation ☆49 · Mar 20, 2025 · Updated 11 months ago
- ☆14 · Aug 28, 2024 · Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- [AAAI 2024] PoseGen: Learning to Generate 3D Human Pose Datasets with NeRF ☆10 · Dec 29, 2023 · Updated 2 years ago
- This project is a demonstration of a content-based recommendation system for Spotify that leverages user's preferences and audio features… ☆17 · Apr 4, 2023 · Updated 2 years ago
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models" ☆53 · Feb 10, 2025 · Updated last year
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆108 · May 29, 2025 · Updated 9 months ago
- The official implementation of the paper "Large Scale Knowledge Washing" ☆10 · Jun 12, 2024 · Updated last year
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning ☆19 · Nov 28, 2025 · Updated 3 months ago
- [CVPR 2025] Official implementation of SSP: High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Se… ☆15 · Jun 26, 2025 · Updated 8 months ago
- LongCTR: A Long Sequence Modeling Benchmark for CTR Prediction ☆17 · Jun 21, 2025 · Updated 8 months ago
- ☆10 · Mar 31, 2025 · Updated 11 months ago
- Source codes for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p… ☆13 · Mar 24, 2025 · Updated 11 months ago
- ☆11 · Jul 2, 2022 · Updated 3 years ago
- Technical Challenge Repository for Visual Anomaly Detection Workshop (VAND) at CVPR ☆13 · Jul 21, 2025 · Updated 7 months ago
- Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025) ☆51 · Feb 4, 2026 · Updated last month
- [CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang ☆14 · Jan 5, 2024 · Updated 2 years ago
- [ECCV 2024] The first zero-shot setting for spatio-temporal video grounding. ☆11 · Jul 16, 2024 · Updated last year
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆47 · Jun 16, 2024 · Updated last year
- Official implementation of ResCLIP: Residual Attention for Training-free Dense Vision-language Inference ☆63 · Oct 27, 2025 · Updated 4 months ago
- DetToolChain: A new prompting paradigm to unleash detection ability of MLLM ☆45 · Oct 12, 2024 · Updated last year
- UGround: Towards Unified Visual Grounding with Unrolled Transformers ☆21 · Feb 15, 2026 · Updated 3 weeks ago
- RESAnything: Attribute Prompting for Arbitrary Referring Segmentation ☆17 · Nov 28, 2025 · Updated 3 months ago
- [ICCV 2025 Highlight] LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs ☆20 · Nov 16, 2025 · Updated 3 months ago
- ☆12 · Jul 12, 2024 · Updated last year
- [CVPR 2025] Official Repository of the paper "On the Consistency of Video Large Language Models in Temporal Comprehension" ☆16 · Oct 13, 2025 · Updated 4 months ago
- The official implementation of Bayesian Cross-modal Alignment Learning for Few-Shot Out-of-Distribution Generalization (AAAI2023). ☆20 · Oct 13, 2025 · Updated 4 months ago
- Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation [TNNLS2024] ☆13 · May 6, 2025 · Updated 10 months ago
- Collections of papers and code for employing MLLM for quality assessment tasks. ☆13 · Apr 18, 2024 · Updated last year
- [AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning ☆106 · Dec 3, 2025 · Updated 3 months ago