GZU-SAMLab / LCM-Captioner
LCM-Captioner is an efficient model for Text-based Image Captioning (TextCap).
☆26Updated 2 years ago
Alternatives and similar repositories for LCM-Captioner
Users who are interested in LCM-Captioner are comparing it to the libraries listed below.
- We propose a text-guided image inpainting method with multi-grained image-text semantic learning (MISL), consisting of global-local gener…☆27Updated 2 years ago
- Phenotype segmentation method based on spectral reconstruction for UAV field vegetation.☆27Updated 2 years ago
- Meta-contrastive Learning with Support-based Query Interaction for Few-shot Fine-grained Visual Classification☆33Updated 2 years ago
- Multi-stage knowledge distillation (MSKD) can improve the accuracy of plant disease detection, which may be a new and vital direction …☆28Updated 2 years ago
- Count-Supervised Network (CSNet) can complete the counting of wheat ears with only quantitative supervision. CSNet: A Count-supervised N…☆31Updated last year
- AA-trans: Core attention aggregating transformer with information entropy selector for fine-grained visual classification☆36Updated 2 years ago
- Common and Distinct Knowledge Mining Network with Content Interaction for Dense Captioning☆29Updated 2 years ago
- ☆1,104Updated last year
- A collection of ICCV 2025 papers and open-source projects☆2,733Updated 3 months ago
- A collection of ECCV 2024 papers and open-source projects. Everyone is welcome to open an issue to share ECCV 2024 papers and open-source projects☆2,240Updated last year
- Dual Pseudo-Labels Interactive Self-Training for Semi-Supervised Visible-Infrared Person Re-Identification☆12Updated last year
- ☆937Updated last year
- [Journal of Image and Graphics & ChinaMM2025] Decision fusion strategy for multimodal object detection without spatial registration☆39Updated 3 months ago
- [MM2024] Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition☆12Updated 11 months ago
- Multi-view dual attention network for 3D object recognition (Neural Computing and Applications, 2021)☆12Updated 3 years ago
- ☆14Updated last year
- [NeurIPS 2024 spotlight] Official implementation of MSFA and release of SARDet_100K dataset for Large-Scale Synthetic Aperture Radar (SAR…☆625Updated 5 months ago
- ❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119☆1,170Updated 2 years ago
- The PyTorch implementation of "FeatWalk: Enhancing Few-Shot Classification through Local View Leveraging", AAAI 2024.☆11Updated last year
- A new framework for automated recognition and measurement of animal behavior☆12Updated 4 months ago
- [Neural Networks 2025] Text-guided Image Restoration and Semantic Enhancement for Text-to-Image Person Retrieval☆11Updated 9 months ago
- ☆640Updated last year
- Surveys related to computer vision, including object detection, tracking, ...☆2,174Updated this week
- A curated list of publications and resources on open-vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).☆737Updated last week
- Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)☆2,087Updated last year
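- The most complete collection of plug-and-play modules on the web for 2025, shared for free! CVPR2025, AAAI2025, ICLR2025, TNNLS2025, arXiv2025, ... Covers all areas of artificial intelligence (machine learning, deep learning, etc.) and applies to image classification, object detection, instance segmentation, semantic segmentation, panoptic segmentation, pose recognition, medical image segmentation, video…☆1,193Updated 4 months ago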
- VMamba: Visual State Space Models; code is based on Mamba☆2,845Updated 7 months ago
- Contains the ViT code and the dataset portion.☆129Updated last year
- Implementation of a Tea Classification WeChat Mini Program Based on Deep Learning.☆17Updated last year
- ☆17Updated 2 months ago