IMPlus-PCALab / GrowGrowUp
Some experiences for new researchers to grow grow up
☆41Updated 2 years ago
Alternatives and similar repositories for GrowGrowUp
Users that are interested in GrowGrowUp are comparing it to the libraries listed below
- Official repository of the paper "High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation"☆32Updated 3 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆35Updated last year
- Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation☆48Updated last month
- [CVPR2025] Official implementation of the paper "Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practi…☆21Updated last week
- Official implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"☆21Updated 2 months ago
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"☆72Updated 9 months ago
- [AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets☆36Updated 10 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆29Updated last year
- Official implementation of ICML2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation☆51Updated 10 months ago
- [CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training☆47Updated 2 months ago
- ☆16Updated 6 months ago
- Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models☆85Updated 9 months ago
- ☆77Updated 7 months ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Updated last year
- GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding☆51Updated last month
- [IJCV 2024]☆16Updated 7 months ago
- [NeurIPS 2024 Spotlight ⭐️] Parameter-Inverted Image Pyramid Networks (PIIP)☆92Updated last month
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025)☆30Updated 3 weeks ago
- HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model☆46Updated last month
- ☆57Updated last month
- An open source codebase for object detection based on Jittor☆18Updated 4 months ago
- ☆25Updated last year
- [CVPR 2023] Official implementation of "SAP-DETR: Bridging the Gap between Salient Points and Queries-Based Transformer Detector for Fast…☆30Updated 2 years ago
- ☆21Updated 3 months ago
- Awesome papers on multi-modal LLMs with grounding ability☆17Updated 10 months ago
- Latest open-source "Thinking with images" (O3/O4-mini) papers, covering training-free, SFT-based, and RL-enhanced methods for "fine-grain…☆52Updated last week
- When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning☆25Updated 2 months ago
- [CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation☆20Updated 5 months ago
- ☆11Updated 5 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆54Updated 7 months ago