CV-Magician / MMM-CLIP
Multi-label image classification with Multi-method CLIP
☆18 Updated last year
Alternatives and similar repositories for MMM-CLIP:
Users who are interested in MMM-CLIP are comparing it to the repositories listed below.
- ☆41 Updated last year
- [CVPR 2024] Code for our Paper "DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets" ☆42 Updated 3 months ago
- ☆36 Updated 2 years ago
- LongShortNet for Streaming Perception task. ☆13 Updated last year
- ☆13 Updated 7 months ago
- This is the code for the paper "ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer" ☆26 Updated last year
- CVPR2024 ☆77 Updated last month
- Official Implementation of Attentive Mask CLIP (ICCV2023, https://arxiv.org/abs/2212.08653) ☆31 Updated 10 months ago
- ☆75 Updated last year
- ☆18 Updated 2 years ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆52 Updated 5 months ago
- Unofficial Implementation of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification [ICCV'23] ☆23 Updated 10 months ago
- [AAAI 2024] TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training ☆82 Updated last year
- [Pattern Recognition] Mix-ViT: Mixing Attentive Vision Transformer for Ultra-Fine-Grained Visual Categorization. ☆21 Updated last year
- The official implementation repository for Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with… ☆66 Updated 4 months ago
- ☆25 Updated last year
- Video Feature Enhancement with PyTorch ☆28 Updated 4 months ago
- Official Pytorch Implementation of Self-emerging Token Labeling ☆33 Updated last year
- Few-shot Object Counting and Detection (ECCV 2022) ☆70 Updated 5 months ago
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect… ☆53 Updated 3 weeks ago
- Official PyTorch implementation of ResFormer: Scaling ViTs with Multi-Resolution Training, CVPR2023 ☆27 Updated last year
- [ICCV 2023] ALIP: Adaptive Language-Image Pre-training with Synthetic Caption ☆98 Updated last year
- Generating Image Specific Text ☆27 Updated last year
- This is an official implementation for [ICLR'24] INTR: Interpretable Transformer for Fine-grained Image Classification. ☆49 Updated last year
- [CVPR'23] A Simple Framework for Text-Supervised Semantic Segmentation ☆59 Updated 3 months ago
- PyTorch implementation of PaCa-ViT (CVPR'23) ☆29 Updated 2 years ago
- Code for "Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection" ☆30 Updated last year
- [CVPR 2025 Highlight] Official Pytorch codebase for paper: "Assessing and Learning Alignment of Unimodal Vision and Language Models" ☆33 Updated last week
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners ☆42 Updated last year
- CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation ☆70 Updated 8 months ago