Official implementation of the CVPR '25 highlight paper "Compositional Caching for Training-free Open-vocabulary Attribute Detection"
☆23 · Dec 23, 2024 · Updated last year
Alternatives and similar repositories for ComCa
Users who are interested in ComCa are comparing it to the repositories listed below
- Official repo of the paper “AL-GTD: Deep Active Learning for Gaze Target Detection” (ACMMM2024) ☆12 · Nov 29, 2024 · Updated last year
- Code implementation of our ICCV 2025 paper: On Large Multimodal Models as Open-World Image Classifiers ☆26 · Dec 4, 2025 · Updated 3 months ago
- Official Implementation of MULTI-LANE (Multi Label class incremental learning via summarising pAtch tokeN Embeddings). Published in 3rd C… ☆15 · Feb 20, 2025 · Updated last year
- Official implementation of "ConViS-Bench: Estimating Video Similarity Through Semantic Concepts", NeurIPS 2025 ☆25 · Nov 28, 2025 · Updated 3 months ago
- [CVPR '25] Official implementation of the paper "Rethinking Few-Shot Adaptation of Vision-Language Models in Two Stages", CVPR 2025. ☆30 · Mar 30, 2025 · Updated 11 months ago
- [CVPR '23 Highlight] Official repository for the paper "Quantum Multi-Model Fitting". ☆11 · Mar 7, 2025 · Updated last year
- Official implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024 ☆70 · Sep 11, 2024 · Updated last year
- [CVPR '24] Official implementation of the paper "Multiflow: Shifting Towards Task-Agnostic Vision-Language Pruning". ☆23 · Mar 7, 2025 · Updated last year
- [TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning". ☆10 · Aug 14, 2024 · Updated last year
- [CVPR 2024 Highlight] OpenBias: Open-set Bias Detection in Text-to-Image Generative Models ☆26 · Feb 13, 2025 · Updated last year
- [ECCV 2024] BUSCA: "Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking" ☆43 · Dec 6, 2024 · Updated last year
- Pytorch implementation of "Diversified in-domain synthesis with efficient fine-tuning for few-shot classification" ☆19 · Mar 25, 2024 · Updated last year
- Code for ICCV 2023 paper ✨ "StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Mo… ☆18 · Jan 25, 2024 · Updated 2 years ago
- [AAAI'25]: Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP ☆19 · Aug 5, 2025 · Updated 7 months ago
- Loomis Painter: Reconstructing the painting process ☆52 · Nov 24, 2025 · Updated 3 months ago
- ☆27 · Oct 31, 2024 · Updated last year
- [NeurIPS '24] Frustratingly easy Test-Time Adaptation of VLMs!! ☆61 · Mar 24, 2025 · Updated 11 months ago
- [CVPR-25🔥] Test-time Counterattacks (TTC) towards adversarial robustness of CLIP ☆39 · Jun 4, 2025 · Updated 9 months ago
- A collection of awesome think with videos papers. ☆91 · Dec 1, 2025 · Updated 3 months ago
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning ☆13 · Apr 14, 2024 · Updated last year
- Thesis Template ☆10 · Mar 2, 2026 · Updated last week
- CoMA: Compositional Human Motion Generation with Multi-modal Agents ☆14 · Jul 31, 2025 · Updated 7 months ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation ☆13 · Jun 17, 2024 · Updated last year
- The official GitHub page for the survey paper "Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey". And this paper is unde… ☆77 · Feb 18, 2026 · Updated 2 weeks ago
- LongCTR: A Long Sequence Modeling Benchmark for CTR Prediction ☆17 · Jun 21, 2025 · Updated 8 months ago
- [CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang ☆14 · Jan 5, 2024 · Updated 2 years ago
- Official training code for MUG-V 10B video generation model. Built on Megatron-LM (v0.14.0) with production-ready distributed training fo… ☆19 · Oct 20, 2025 · Updated 4 months ago
- ☆14 · Jan 5, 2022 · Updated 4 years ago
- ☆24 · Oct 9, 2025 · Updated 5 months ago
- Code implementation of our BMVC 2022 paper: Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition ☆11 · Dec 18, 2022 · Updated 3 years ago
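- Automatically books appointment slots at the Civil Affairs Bureau marriage registry; limited to the Guangdong Provincial Civil Affairs Bureau ☆10 · Jun 25, 2023 · Updated 2 years ago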
- ☆13 · Nov 28, 2021 · Updated 4 years ago
- [ICCV 2025] Repository for A Quality-Guided Mixture of Score-fusion Experts Framework for Human Recognition ☆16 · Sep 29, 2025 · Updated 5 months ago
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023 ☆11 · Oct 5, 2023 · Updated 2 years ago
- The Koudai48 VOD Manager ☆10 · May 2, 2019 · Updated 6 years ago
- [CVPR 2025] Official implementation of SSP: High Temporal Consistency through Semantic Similarity Propagation in Semi-Supervised Video Se… ☆15 · Jun 26, 2025 · Updated 8 months ago
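- Mathematical modeling group newcomer's handbook, maintained long-term > < (GPL-licensed, non-commercial use; please credit the source if you use any of its content) ☆11 · Oct 11, 2019 · Updated 6 years ago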
- ☆10 · Mar 31, 2025 · Updated 11 months ago
- ☆11 · Jul 2, 2022 · Updated 3 years ago