DRSY / MoTIS
[NAACL 2022] Mobile text-to-image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
☆123 · Updated 2 years ago
Alternatives and similar repositories for MoTIS
Users interested in MoTIS are comparing it to the libraries listed below.
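The common thread in these repositories is CLIP-style retrieval: images and text are embedded into a shared vector space, and search ranks images by cosine similarity to the query embedding. A minimal sketch of that ranking step is below; the embeddings are toy stand-ins, not real CLIP outputs.

```python
# Sketch of CLIP-style text-to-image search: rank precomputed image
# embeddings by cosine similarity to a text-query embedding.
# The vectors here are toy 2-D stand-ins, not real CLIP features.
import numpy as np

def rank_images(query_emb: np.ndarray, image_embs: np.ndarray, top_k: int = 3):
    """Return indices of the top_k images most similar to the query."""
    # L2-normalize so a plain dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = imgs @ q
    # Sort descending by similarity and keep the top_k indices
    return np.argsort(-sims)[:top_k].tolist()

# Toy gallery of 4 "image" embeddings; the query points closest to index 2
image_embs = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]])
query_emb = np.array([0.55, 0.83])
print(rank_images(query_emb, image_embs, top_k=2))  # → [2, 1]
```

In a real system the image embeddings would be precomputed once (on-device in MoTIS's mobile setting) and the text encoder run per query; the ranking step itself stays this simple.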
- Using pretrained encoders and language models to generate captions from multimedia inputs. ☆97 · Updated 2 years ago
- CLIP-Finder enables semantic offline search of gallery photos using natural-language descriptions or the camera. Built on A… ☆76 · Updated 9 months ago
- OpenAI CLIP Core ML version for iOS: text-image embeddings, image search, image clustering, and image classification. ☆19 · Updated last year
- M4 experiment logbook. ☆57 · Updated last year
- ☆18 · Updated last year
- Utility to test the performance of Core ML models. ☆70 · Updated 4 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022). ☆242 · Updated 2 years ago
- ☆88 · Updated last year
- Chinese text encoder for CLIP. ☆22 · Updated 2 years ago
- ECCV 2020 paper: "Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards." Code and data. ☆84 · Updated last year
- Search photos on Unsplash with OpenAI's CLIP model; supports joint image+text queries and attention visualization. ☆222 · Updated 3 years ago
- Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training (ACL 2023). ☆90 · Updated last year
- Easily compute CLIP embeddings from video frames. ☆145 · Updated last year
- Implementation of DeepMind's Flamingo vision-language model, based on Hugging Face language models and ready for training. ☆167 · Updated 2 years ago
- Get hundreds of millions of image+URL pairs from the crawling-at-home dataset and preprocess them. ☆220 · Updated 11 months ago
- ☆103 · Updated last year
- ☆131 · Updated 2 years ago
- ☆64 · Updated last year
- [ECCV 2022] FashionViL: Fashion-Focused V+L Representation Learning. ☆61 · Updated 2 years ago
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic. ☆275 · Updated 2 years ago
- A repository containing datasets and tools to train a watermark classifier. ☆67 · Updated 2 years ago
- [BMVC 2022] Official implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment". ☆55 · Updated 2 years ago
- ☆67 · Updated last year
- PyTorch code for MUST. ☆106 · Updated last week
- Language Models Can See: Plugging Visual Controls in Text Generation. ☆256 · Updated 2 years ago
- [ECCV 2022] Contrastive Vision-Language Pre-training with Limited Resources. ☆44 · Updated 2 years ago
- 🦩 Visual Instruction Tuning with Polite Flamingo: training multimodal LLMs to be both clever and polite! (AAAI-24 Oral) ☆64 · Updated last year
- A non-JIT implementation/replication of OpenAI's CLIP in PyTorch. ☆34 · Updated 4 years ago
- Research publication code for "Forward Compatible Training for Large-Scale Embedding Retrieval Systems" (CVPR 2022) and "FastFill: Effici… ☆55 · Updated 2 years ago
- CapDec: SOTA zero-shot image captioning using CLIP and GPT-2 (Findings of EMNLP 2022). ☆196 · Updated last year