enrico310786 / image_text_retrieval_BLIP_BLIP2
Experiments with the LAVIS library to perform image-to-text and text-to-image retrieval with BLIP and BLIP2 models
☆15 Updated last year
Alternatives and similar repositories for image_text_retrieval_BLIP_BLIP2
Users who are interested in image_text_retrieval_BLIP_BLIP2 are comparing it to the repositories listed below
- Research Code for Multimodal-Cognition Team in Ant Group ☆165 Updated 2 months ago
- The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation". ☆251 Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM ☆101 Updated last year
- ☆79 Updated last year
- An open-source, commercially usable multimodal model that supports bilingual (Chinese and English) visual-text dialogue. ☆377 Updated last year
- Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks ☆301 Updated last year
- ☆87 Updated last year
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever. ☆97 Updated 3 months ago
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆223 Updated 4 months ago
- Chinese OFA model built on the transformers structure ☆137 Updated 2 years ago
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"