☆29Jun 10, 2024Updated last year
Alternatives and similar repositories for CountCLIP
Users that are interested in CountCLIP are comparing it to the libraries listed below
- Official implementation of "Make It Count: Text-to-Image Generation with an Accurate Number of Objects" (CVPR 2025)☆97Mar 12, 2025Updated last year
- [ECCV 2024] Teach CLIP to Develop a Number Sense for Ordinal Regression☆19Apr 1, 2025Updated 11 months ago
- ☆15Feb 24, 2023Updated 3 years ago
- Official codebase for the NeurIPS 2023 paper: Towards Last-layer Retraining for Group Robustness with Fewer Annotations. https://arxiv.or…☆12May 15, 2024Updated last year
- Ranking-Consistent Language-Image Pretraining☆12Oct 24, 2025Updated 4 months ago
- Code for "CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally"☆25Feb 27, 2026Updated 3 weeks ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆13Sep 30, 2023Updated 2 years ago
- CVPR 2024 Official Repository☆12Mar 27, 2024Updated last year
- GeckoNum Benchmark for T2I Model Eval.☆15Dec 5, 2024Updated last year
- [NeurIPS 2024] Official implementation of "Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance"☆17Dec 4, 2024Updated last year
- If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions☆17Apr 4, 2024Updated last year
- [ECCV 2024] Official code for "Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation"☆18Jul 31, 2025Updated 7 months ago
- ☆17Aug 8, 2024Updated last year
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation…☆22Nov 8, 2023Updated 2 years ago
- Code to reproduce the experiments in the paper: Does CLIP Bind Concepts? Probing Compositionality in Large Image Models.☆16Oct 14, 2023Updated 2 years ago
- ☆11Sep 15, 2023Updated 2 years ago
- ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation☆27May 27, 2025Updated 9 months ago
- Official repository for the paper "Instance-Wise Holistic Order Prediction in Natural Scenes".☆27Jan 11, 2024Updated 2 years ago
- Code for the CCE algorithm proposed in "Towards Compositionality in Concept Learning" at ICML 2024.☆16Jun 2, 2024Updated last year
- [ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting☆123Mar 20, 2024Updated 2 years ago
- Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)☆23Nov 25, 2025Updated 3 months ago
- [IJCV 2026] HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts☆26Feb 28, 2025Updated last year
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding☆56Apr 7, 2025Updated 11 months ago
- Official repo for [CVPR 2026] "SARMAE: Masked Autoencoder for SAR Representation Learning"☆32Dec 19, 2025Updated 3 months ago
- This is the official repository of the paper "SAR-TEXT: A Large-Scale SAR Image-Text Dataset Built with SAR-Narrator and Progressive Tran…☆29Oct 22, 2025Updated 5 months ago
- Detail-Oriented CLIP for Fine-Grained Tasks (ICLR SSI-FM 2025)☆57Mar 26, 2025Updated 11 months ago
- ☆24Sep 12, 2023Updated 2 years ago
- Mitigating Open-Vocabulary Caption Hallucinations (EMNLP 2024)☆18Oct 18, 2024Updated last year
- [T-PAMI 2023] Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection☆39Aug 29, 2023Updated 2 years ago
- The code for paper entitled "Data-Driven Modulation Optimization with LMMSE Equalization for Reliability Enhancement in Underwater Acoust…☆19Oct 4, 2025Updated 5 months ago
- PyTorch implementation of "Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation" [The Visual Computer…☆25Jan 7, 2025Updated last year
- [NeurIPS 2024 Spotlight] code for "Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement"☆19Jan 26, 2025Updated last year
- ☆39May 20, 2025Updated 10 months ago
- Repository to perform multi animal pose detection. In particular this code is used for bee pose estimation.☆10Jan 10, 2022Updated 4 years ago
- ☆20May 3, 2025Updated 10 months ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024☆11Jun 6, 2024Updated last year
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation"☆46Feb 26, 2026Updated 3 weeks ago
- Implementation of "Conditional Score Guidance for Text-Driven Image-to-Image Translation" (NeurIPS 2023).☆11Jul 19, 2023Updated 2 years ago
- This repo consists of my implementation of DocFormerV2☆11Mar 31, 2024Updated last year