ZYiJie / text2img
Pattern recognition course project code: text-to-image generation (CLIP+DALLE+BriVL)
☆19Updated last year
Related projects
Alternatives and complementary repositories for text2img
- Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.☆164Updated 2 years ago
- Chinese OFA model built on the transformers structure☆123Updated last year
- Chinese CLIP: supports custom datasets, extracts vectors from text and images, and performs text-image matching.☆21Updated 2 years ago
- ☆32Updated 2 years ago
- ☆57Updated last year
- Bling's Object detection tool☆55Updated last year
- ☆157Updated last year
- ☆63Updated 10 months ago
- Bridging Vision and Language Model☆279Updated last year
- ClipCap-based image captioning model☆283Updated 2 years ago
- PyTorch implementation of MVP: a multi-stage vision-language pre-training framework☆33Updated last year
- This project aims to retrieve matching images from an input text description.☆26Updated last year
- ☆66Updated last year
- Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks☆179Updated last year
- TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)☆175Updated 11 months ago
- Code for the ICCV 2023 paper: “ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction”☆50Updated last year
- ☆100Updated 3 years ago
- Search photos on Unsplash based on OpenAI's CLIP model, support search with joint image+text queries and attention visualization.☆208Updated 3 years ago
- Cross-lingual image captioning☆83Updated 2 years ago
- Assorted material shared by breezedeus☆22Updated last year
- Source code for the paper "Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granular…☆38Updated last year
- Baichuan-13B instruction fine-tuning☆89Updated last year
- [AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”☆214Updated 7 months ago
- [NAACL 2022 Findings] Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extrac…☆98Updated last year
- ☆15Updated 7 months ago
- Computer vision course project: an image-text retrieval system based on Chinese-CLIP☆47Updated last year
- The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer☆44Updated 4 months ago