[ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting
☆123 · Mar 20, 2024 · Updated last year
Alternatives and similar repositories for CLIP-Count
Users interested in CLIP-Count are comparing it to the repositories listed below.
- Includes FSC-147-D and the code for training and testing the CounTX model from the paper Open-world Text-specified Object Counting. ☆41 · Sep 27, 2024 · Updated last year
- CounTR: Transformer-based Generalised Visual Counting ☆123 · Jul 11, 2024 · Updated last year
- ☆17 · Jul 26, 2023 · Updated 2 years ago
- [AAAI 2024] VLCounter: Text-aware Visual Representation for Zero-Shot Object Counting ☆43 · Nov 19, 2024 · Updated last year
- [CVPR 2023] CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model ☆91 · Jul 28, 2023 · Updated 2 years ago
- [ECCV 2024] Zero-shot Object Counting with Good Exemplars ☆27 · Sep 27, 2024 · Updated last year
- Learning to Count without Annotations ☆23 · May 24, 2024 · Updated last year
- Official implementation of Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision, Michael … ☆39 · Jul 12, 2024 · Updated last year
- [CVPR 2023] Zero-shot Counting ☆60 · Mar 23, 2025 · Updated 11 months ago
- [ECCV 2022] An End-to-End Transformer Model for Crowd Localization ☆109 · Mar 20, 2023 · Updated 2 years ago
- ☆430 · Nov 30, 2023 · Updated 2 years ago
- Includes the code for training and testing the CountGD model from the paper CountGD: Multi-Modal Open-World Counting. ☆304 · Jun 25, 2025 · Updated 8 months ago
- ☆29 · Jun 10, 2024 · Updated last year
- The official implementation of the crowd counting model CLIP-EBC. ☆92 · Jul 17, 2024 · Updated last year
- [WACV 2023] Few-shot Object Counting with Similarity-Aware Feature Enhancement ☆142 · Oct 10, 2023 · Updated 2 years ago
- ☆16 · Sep 6, 2024 · Updated last year
- ☆16 · Nov 29, 2024 · Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon! ☆11 · May 24, 2023 · Updated 2 years ago
- Code for the ACM Multimedia 2023 paper "DAOT: Domain-Agnostically Aligned Optimal Transport for Domain-Adaptive Crowd Counting". ☆13 · Jan 12, 2024 · Updated 2 years ago
- GeckoNum benchmark for T2I model evaluation. ☆15 · Dec 5, 2024 · Updated last year
- Official PyTorch implementation of FusionCount: Efficient Crowd Counting via Multiscale Feature Fusion ☆13 · Oct 25, 2022 · Updated 3 years ago
- Official PyTorch implementation of Spatio-channel Attention Blocks for Cross-modal Crowd Counting (ACCV 2022, Oral) ☆27 · Dec 4, 2023 · Updated 2 years ago
- ☆27 · Feb 21, 2025 · Updated last year
- An empirical study on few-shot counting using Segment Anything (SAM) ☆95 · Apr 25, 2023 · Updated 2 years ago
- Official implementation of the CVPR 2022 paper "Boosting Crowd Counting via Multifaceted Attention" ☆123 · May 10, 2025 · Updated 9 months ago
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆55 · Apr 7, 2025 · Updated 10 months ago
- Deploy Detic with ONNXRuntime to detect 21,000 object categories; includes both C++ and Python versions. ☆17 · Aug 29, 2023 · Updated 2 years ago
- [ECCV 2022] Few-shot Object Counting and Detection ☆83 · Nov 12, 2024 · Updated last year
- Repository for the NWPU-MOC dataset and code. ☆23 · Jan 24, 2024 · Updated 2 years ago
- PyTorch implementation of the paper "DR.VIC: Decomposition and Reasoning for Video Individual Counting" (CVPR 2022) ☆58 · Jun 12, 2023 · Updated 2 years ago
- ☆45 · Oct 5, 2025 · Updated 4 months ago
- SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models ☆21 · Jan 11, 2024 · Updated 2 years ago
- ☆135 · Apr 2, 2024 · Updated last year
- Official PyTorch implementation of the paper "Adversarial Bipartite Graph Learning for Video Domain Adaptation" (ACM MM 2020, Oral) ☆11 · Jun 16, 2022 · Updated 3 years ago
- Official PyTorch implementation of "Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in V…" ☆14 · Aug 29, 2022 · Updated 3 years ago
- A PyTorch implementation of the paper "MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis" ☆12 · Jan 16, 2023 · Updated 3 years ago
- Code reproduction ☆69 · Feb 28, 2022 · Updated 4 years ago
- S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions ☆50 · May 26, 2023 · Updated 2 years ago
- [CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?" ☆32 · May 12, 2025 · Updated 9 months ago