DRSY / MoTIS
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
☆127 · Updated 2 years ago
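At its core, CLIP-powered text-to-image search embeds the query text and the gallery images into the same space and ranks images by cosine similarity. Below is a minimal sketch of that retrieval loop using the Hugging Face `transformers` CLIP API; the checkpoint name and helper functions are illustrative and this is a generic Python example, not MoTIS's actual mobile (Swift/Core ML) implementation.

```python
# Minimal sketch of CLIP-based text-to-image retrieval, assuming the
# `transformers`, `torch`, and `Pillow` packages are installed.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_ID = "openai/clip-vit-base-patch32"  # illustrative checkpoint choice
model = CLIPModel.from_pretrained(MODEL_ID).eval()
processor = CLIPProcessor.from_pretrained(MODEL_ID)

def embed_images(paths):
    """Return L2-normalized CLIP image embeddings for a list of file paths."""
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def search(query, paths, image_feats, top_k=5):
    """Rank images by cosine similarity between the text query and image embeddings."""
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        text_feat = model.get_text_features(**inputs)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    scores = (image_feats @ text_feat.T).squeeze(-1)  # cosine similarity per image
    ranked = scores.argsort(descending=True)[:top_k]
    return [(paths[i], scores[i].item()) for i in ranked.tolist()]
```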
Alternatives and similar repositories for MoTIS
Users interested in MoTIS are comparing it to the libraries listed below.
- Using pretrained encoder and language models to generate captions from multimedia inputs. ☆100 · Updated 2 years ago
- Utility to test the performance of CoreML models. ☆70 · Updated 5 years ago
- ECCV 2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data. ☆86 · Updated 2 years ago
- Easily compute CLIP embeddings from video frames (see the sketch after this list) ☆147 · Updated 2 years ago
- A repository containing datasets and tools to train a watermark classifier. ☆74 · Updated 3 years ago
- CLIP-Finder enables semantic offline searches of images from gallery photos using natural language descriptions or the camera. Built on A… ☆90 · Updated last year
- ☆18 · Updated 2 years ago
- We present **FOCI**, a benchmark for Fine-grained Object ClassIfication for large vision language models (LVLMs). ☆19 · Updated last year
- Chinese-language encoder for CLIP ☆22 · Updated 3 years ago
- A simple library to speed up CLIP inference by up to 3x (on a K80 GPU) ☆231 · Updated 2 years ago
- Use CLIP to represent videos for retrieval tasks ☆70 · Updated 4 years ago
- A non-JIT implementation / replication of OpenAI's CLIP in PyTorch ☆34 · Updated 5 years ago
- ☆63 · Updated 3 months ago
- ☆65 · Updated 2 years ago
- The official PyTorch implementation of the arXiv'23 paper 'LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer' ☆103 · Updated 8 months ago
- Get hundreds of millions of image+URL pairs from the crawling at home dataset and preprocess them ☆223 · Updated last year
- Codebase for the SIMAT dataset and evaluation ☆38 · Updated 3 years ago
- Search photos on Unsplash based on OpenAI's CLIP model; supports search with joint image+text queries and attention visualization. ☆224 · Updated 4 years ago
- Diffusion-based markup-to-image generation ☆83 · Updated 2 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022) ☆246 · Updated 8 months ago
- ☆103 · Updated 2 years ago
- ☆141 · Updated 3 years ago
- ☆23 · Updated last year
- Load any CLIP model with a standardized interface ☆22 · Updated 3 months ago
- A simple web server/API over an rclip-style CLIP embedding database. ☆32 · Updated 3 years ago
- ☆87 · Updated 2 years ago
- Big-Interleaved-Dataset ☆58 · Updated 3 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆141 · Updated last month
- Official implementation of "Active Image Indexing" ☆60 · Updated 2 years ago
- ALIGN trained on the COYO dataset ☆29 · Updated last year
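For the video-frame entry above, the underlying idea is simply frame sampling followed by the CLIP image encoder. A rough sketch, assuming `opencv-python`, `transformers`, and `Pillow`; this is not that repository's API, only a generic illustration:

```python
# Generic sketch: sample video frames and embed them with CLIP.
import cv2
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_ID = "openai/clip-vit-base-patch32"  # illustrative checkpoint choice
model = CLIPModel.from_pretrained(MODEL_ID).eval()
processor = CLIPProcessor.from_pretrained(MODEL_ID)

def video_frame_embeddings(path, every_n_frames=30):
    """Return L2-normalized CLIP embeddings for every N-th frame of a video."""
    cap = cv2.VideoCapture(path)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n_frames == 0:
            # OpenCV yields BGR arrays; convert to RGB PIL images for the processor.
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
        idx += 1
    cap.release()
    inputs = processor(images=frames, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)  # one unit vector per sampled frame
```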