ForJadeForest / ImageSearchLightningCLIP
Deploying a distilled CLIP model on Android devices
☆19Updated last year
Related projects
Alternatives and complementary repositories for ImageSearchLightningCLIP
- A demo for running a quantized CLIP model (ViT-B/32) on Android.☆35Updated last year
- CLIP⚡NCNN⚡Natural-language image search (Image Search)⚡Search images by text⚡x86⚡Android☆223Updated last year
- Research Code for Multimodal-Cognition Team in Ant Group☆123Updated 4 months ago
- Chinese encoder for CLIP☆21Updated 2 years ago
- Deployment of deep learning models on mobile platforms, covering some common CV and NLP tasks.☆17Updated 2 weeks ago
- ☆55Updated 10 months ago
- The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".☆221Updated 9 months ago
- Workshop on Foundation Model 1st foundation model challenge Track1 codebase (Open TransMind v1.0)☆18Updated last year
- [NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)☆121Updated last year
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆17Updated 8 months ago
- Segment Anything Model (SAM) inference with ncnn on Android phones☆27Updated last year
- Portrait matting with the ModNet algorithm based on PaddleSeg (Android demo)☆58Updated 3 years ago
- This project aims to retrieve images that match an input text description.☆26Updated last year
- ☆66Updated last year
- Multimodal chatbot with computer vision capabilities integrated☆99Updated 6 months ago
- ☆156Updated 8 months ago
- Vary-tiny codebase built on LAVIS (for training from scratch) and a PDF image-text pair dataset (about 600k pairs, English and Chinese)☆68Updated 2 months ago
- [IJCAI 2024] CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning☆22Updated 9 months ago
- Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".☆119Updated 2 weeks ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.☆35Updated 2 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance☆66Updated 4 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models☆53Updated 3 weeks ago
- InstructionGPT-4☆37Updated 10 months ago
- The official code for the NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆76Updated this week
- Personal project repository: some applications of large language models and multimodal models☆123Updated 2 weeks ago
- Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original c…☆21Updated last week
- ☆14Updated last year
- PyTorch distributed training framework☆71Updated this week
- [CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge☆122Updated 4 months ago
- The Hugging Face implementation of Fine-grained Late-interaction Multi-modal Retriever.☆69Updated 2 months ago