GZU-SAMLab / LCM-Captioner
LCM-Captioner is an efficient model for Text-based Image Captioning (TextCap).
☆26Updated 2 years ago
Alternatives and similar repositories for LCM-Captioner
Users who are interested in LCM-Captioner are comparing it to the repositories listed below.
- Phenotype segmentation method based on spectral reconstruction for UAV field vegetation.☆26Updated last year
- We propose a text-guided image inpainting method with multi-grained image-text semantic learning (MISL), consisting of global-local gener…☆27Updated last year
- Meta-contrastive Learning with Support-based Query Interaction for Few-shot Fine-grained Visual Classification☆33Updated last year
- Multi-stage knowledge distillation (MSKD) can facilitate the accuracy of plant disease detection, which may be a new and vital direction …☆28Updated last year
- Count-Supervised Network (CSNet) can complete the counting of wheat ears with only quantitative supervision. CSNet: A Count-supervised N…☆29Updated last year
- Common and Distinct Knowledge Mining Network with Content Interaction for Dense Captioning☆29Updated last year
- AA-trans: Core attention aggregating transformer with information entropy selector for fine-grained visual classification☆34Updated last year
- T3Bench: Benchmarking Current Progress in Text-to-3D Generation☆1,099Updated last year
- ☆1,069Updated last year
- Vim with Chinese notation☆20Updated last year
- ☆937Updated last year
- Surveys related to computer vision, including object detection, tracking, ...☆2,104Updated this week
- A curated publication list on open vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).☆679Updated 3 months ago
- ☆17Updated 4 months ago
- VMamba: Visual State Space Models; code is based on Mamba☆2,718Updated 4 months ago
- About [MM2024] Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition☆11Updated 8 months ago
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model☆3,512Updated 5 months ago
- Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive semi-automatic image annotation tool.☆1,663Updated 2 months ago
- (TPAMI 2024) A Survey on Open Vocabulary Learning☆942Updated 4 months ago
- This repository is intended to store the code and data for ASAP (Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting…☆13Updated last month
- ☆251Updated last year
- [ChinaMM2025] Decision fusion strategy for non-spatially-registered multimodal object detection☆38Updated last week
- ☆12Updated last year
- [TMM-2025] The official implementation of "IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting".☆24Updated 3 months ago
- MambaOut: Do We Really Need Mamba for Vision? (CVPR 2025)☆2,467Updated 4 months ago
- This repository contains the ViT code and the dataset.☆129Updated last year
- Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts☆1,224Updated 6 months ago
- Code reproductions of plug-and-play deep learning modules☆95Updated 3 months ago
- [CVPR 2024] Code release for TransNeXt model☆538Updated last year
- A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.☆627Updated this week