yeyimilk / CrowdVLM-R1
Proposes a fuzzy reward model with GRPO to improve VLMs' abilities in the crowd counting task.
☆21Updated 9 months ago
Alternatives and similar repositories for CrowdVLM-R1
Users who are interested in CrowdVLM-R1 are comparing it to the repositories listed below.
- Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019)☆22Updated 4 years ago
- Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS2023)☆128Updated last year
- A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting [ECCV 2024]☆103Updated 2 years ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024)☆53Updated 7 months ago
- This is code of paper "ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer"☆26Updated 2 years ago
- ☆63Updated 2 years ago
- [NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models☆131Updated 2 years ago
- ☆138Updated last year
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization☆57Updated 2 years ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆58Updated 11 months ago
- [NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".☆42Updated last year
- Official Implementation for CVPR 2024 paper: CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor☆110Updated last year
- [NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples☆65Updated last year
- [ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation☆49Updated 10 months ago
- The official codes and datasets for Artistic Text Segmentation (ECCV 2024).☆27Updated 4 months ago
- AI-SAM: Automatic and Interactive Segment Anything Model☆21Updated 11 months ago
- ☆201Updated 8 months ago
- [CVPR 2024 Highlight] Official GraCo: Granularity-Controllable Interactive Segmentation.☆61Updated 10 months ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets☆62Updated 6 months ago
- [NeurIPS-W 2025] Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"☆58Updated 7 months ago
- Recognize Any Regions☆123Updated last year
- [ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation☆80Updated 2 years ago
- [CVPR 2023] implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information.☆91Updated 2 years ago
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference☆97Updated 10 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆137Updated 9 months ago
- ☆32Updated last year
- This is the repository for paper "UniQA: Unified Vision-Language Pre-training of Quality and Aesthetics"☆28Updated 10 months ago
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"☆211Updated last year
- Official PyTorch implementation of ResFormer: Scaling ViTs with Multi-Resolution Training, CVPR2023☆30Updated 2 years ago
- [AAAI 2025] Official Implementation of "FOCUS: Towards Universal Foreground Segmentation"☆55Updated 7 months ago