[ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition
☆16 Jul 9, 2024 Updated last year
Alternatives and similar repositories for CoMC
Users that are interested in CoMC are comparing it to the libraries listed below.
- [CVPR2025] Official implementation of RAM ☆29 Nov 4, 2025 Updated 4 months ago
- ☆17 Aug 8, 2024 Updated last year
- Shared Attention for Multi-label Zero-shot Learning accepted @ CVPR20 ☆32 Dec 21, 2021 Updated 4 years ago
- [MM2024] Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition ☆13 Nov 12, 2024 Updated last year
- [CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?". ☆32 May 12, 2025 Updated 10 months ago
- Implementation for "DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations" (NeurIPS 2022) ☆71 Oct 24, 2023 Updated 2 years ago
- [ICME 2023, Oral] HybridPoint: Point cloud registration based on hybrid point sampling and matching ☆29 Mar 14, 2024 Updated 2 years ago
- [NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models ☆31 Nov 10, 2025 Updated 4 months ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆58 Sep 3, 2024 Updated last year
- [CVPR2024] Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names ☆21 Nov 6, 2024 Updated last year
- ☆14 Sep 20, 2025 Updated 6 months ago
- [AAAI 2025] RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection ☆41 Mar 14, 2025 Updated last year
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer' ☆10 Jan 23, 2024 Updated 2 years ago
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data ☆45 Oct 15, 2023 Updated 2 years ago
- Label Studio is a multi-type data labeling and annotation tool with standardized output format ☆10 Nov 17, 2021 Updated 4 years ago
- Unsupervised Cross-lingual Sentiment Analysis (CoNLL 2019) ☆10 Nov 4, 2019 Updated 6 years ago
- ☆14 Jan 5, 2022 Updated 4 years ago
- ☆14 Oct 14, 2019 Updated 6 years ago
- ☆13 Jan 5, 2022 Updated 4 years ago
- A vision-language model with an improved cross-attention mechanism for scalable streaming inference ☆28 Mar 9, 2026 Updated 2 weeks ago
- Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question ☆11 Jul 18, 2024 Updated last year
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model ☆15 Jul 31, 2025 Updated 7 months ago
- Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment, CVPR, 2024 ☆108 Jun 26, 2025 Updated 8 months ago
- ☆16 Feb 23, 2025 Updated last year
- [ACL Main 2025] I0T: Embedding Standardization Method Towards Zero Modality Gap ☆12 Jun 18, 2025 Updated 9 months ago
- ☆26 Aug 23, 2022 Updated 3 years ago
- ☆95 Sep 23, 2023 Updated 2 years ago
- Code & data for IJCAI'22 paper "Recipe2Vec: Multi-modal Recipe Representation Learning with Graph Neural Networks". ☆14 Jul 24, 2022 Updated 3 years ago
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners" ☆13 Dec 1, 2024 Updated last year
- ☆10 Oct 14, 2020 Updated 5 years ago
- Source code for NAACL 2022 paper Weakly Supervised Text Classification using Supervision Signals from a Language Mode ☆10 Jun 13, 2022 Updated 3 years ago
- This repo is the implementation of "A Neural Topic-Attention Model for Medical Term Abbreviation Disambiguation". ☆15 Dec 3, 2019 Updated 6 years ago
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023 ☆11 Oct 5, 2023 Updated 2 years ago
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning ☆13 Apr 14, 2024 Updated last year
- ☆17 May 31, 2023 Updated 2 years ago
- Official implementation of the paper “Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition,” AAAI 2… ☆34 Jan 30, 2026 Updated last month
- ☆25 Oct 9, 2025 Updated 5 months ago
- An annotation tool for rapid multi-task collaborative information extraction for knowledge graph construction. ☆21 Jun 12, 2025 Updated 9 months ago
- [arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection" ☆15 Aug 30, 2023 Updated 2 years ago