songw-zju / PixelThink
The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025)
☆21Updated last week
Alternatives and similar repositories for PixelThink
Users who are interested in PixelThink are comparing it to the repositories listed below.
- Official code for ECCV 2024 paper "Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection"☆14Updated 11 months ago
- ☆59Updated 2 weeks ago
- [NIPS24] Official Implementation of Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation☆18Updated 7 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆34Updated last year
- Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024☆29Updated 10 months ago
- Official PyTorch codes for "Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation", ECCV2024☆29Updated 10 months ago
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆39Updated last month
- [CVPR 2025] Test-Time Visual In-Context Tuning☆23Updated 2 months ago
- [TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det☆29Updated last year
- The official implementation of JM3D.☆30Updated last month
- Official implementation of "Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness".☆25Updated 2 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language?☆21Updated last year
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception☆10Updated last month
- [ECCV 2024] R2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations☆10Updated 10 months ago
- Official PyTorch implementation of paper “InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction”☆16Updated 3 weeks ago
- [CVPR'25] Official implementation of "Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation"☆27Updated last month
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews)☆37Updated last month
- [CVPR 2025 highlight] v-CLR: View-Consistent Learning for Open-World Instance Segmentation☆18Updated 2 months ago
- ☆43Updated 8 months ago
- [IJCV 2024]☆16Updated 6 months ago
- [AAAI 2024] The official implementation of the paper "3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Refer…☆42Updated last year
- ☆30Updated 4 months ago
- (ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation☆47Updated 10 months ago
- This is the project for 'USG'.☆16Updated 2 months ago
- [CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning.☆29Updated 2 months ago
- Sambor: Boosting Segment Anything Model Towards Open-Vocabulary Learning☆30Updated last year
- (ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation☆38Updated last year
- This repository is dedicated to Track 2 of the W-CODA 2024 Workshop, "Multimodal Perception and Comprehension of Corner Cases in Autonomo…☆11Updated 11 months ago
- Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction'☆57Updated 5 months ago
- The official repository for paper "MLLMs Need 3D-Aware Representation Supervision for Scene Understanding"☆30Updated this week