GZU-SAMLab / LCM-Captioner
LCM-Captioner is an efficient model for Text-based Image Captioning (TextCap).
☆26Updated 2 years ago
Alternatives and similar repositories for LCM-Captioner
Users who are interested in LCM-Captioner are comparing it to the libraries listed below.
- We propose a text-guided image inpainting method with multi-grained image-text semantic learning (MISL), consisting of global-local gener…☆27Updated 2 years ago
- Meta-contrastive Learning with Support-based Query Interaction for Few-shot Fine-grained Visual Classification☆33Updated 2 years ago
- Phenotype segmentation method based on spectral reconstruction for UAV field vegetation.☆28Updated 2 years ago
- Multi-stage knowledge distillation (MSKD) can facilitate the accuracy of plant disease detection, which may be a new and vital direction …☆28Updated 2 years ago
- Count-Supervised Network (CSNet) can complete the counting of wheat ears with only quantitative supervision. CSNet: A Count-supervised N…☆32Updated last year
- Common and Distinct Knowledge Mining Network with Content Interaction for Dense Captioning☆29Updated 2 years ago
- AA-trans: Core attention aggregating transformer with information entropy selector for fine-grained visual classification☆37Updated 2 years ago
- T3Bench: Benchmarking Current Progress in Text-to-3D Generation☆1,100Updated 2 years ago
- VMamba: Visual State Space Models; code is based on Mamba☆3,039Updated 11 months ago
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model☆3,792Updated 11 months ago
- This contains the ViT code and the dataset portion.☆133Updated last year
- [ACM'MM 2025] UAV Street-Satellite matching workshop Challenging paper, SkyLink: Unifying Street-Satellite Geo-Localization via UAV-Media…☆24Updated 2 months ago
- [MM2024] Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition☆13Updated last year
- Collection of ICCV 2025 papers and open-source projects☆2,851Updated 7 months ago
- [Official Repo] Visual Mamba: A Survey and New Outlooks☆731Updated 11 months ago
- Stanford University CS231n 2016 winter assignments☆46Updated 3 years ago
- Image Processing study notes. Tutorial: https://github.com/WZMIAOMIAO/deep-learning-for-image-processing Corresponding videos: https://space.bilibili.com/18161609☆139Updated last year
- Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)☆2,169Updated last year
- (TPAMI 2024) A Survey on Open Vocabulary Learning☆986Updated last month
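- The most complete collection of plug-and-play modules on the web for 2025, shared for free! CVPR2025, AAAI2025, ICLR2025, TNNLS2025, arXiv2025...... Covers all areas of artificial intelligence (machine learning, deep learning, etc.); applicable to image classification, object detection, instance segmentation, semantic segmentation, panoptic segmentation, pose recognition, medical image segmentation, video…☆1,414Updated 8 months ago
- A curated publication list on open vocabulary semantic segmentation and related areas (e.g. zero-shot semantic segmentation).☆827Updated 3 weeks ago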
- AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligenc…☆594Updated last year
- Implementation of a Tea Classification WeChat Mini Program Based on Deep Learning.☆19Updated last year
- [CVPR 2023] Official implementation for "CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion."☆600Updated last year
- ❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119☆1,214Updated 2 years ago
- [CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".☆803Updated 2 years ago
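- Collection of ECCV 2024 papers and open-source projects; everyone is welcome to submit issues to share ECCV 2024 papers and open-source projects☆2,278Updated last year
- Various small plug-and-play modules for deep learning☆464Updated last year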