vijishmadhavan / Crop-CLIP
Crop using CLIP
☆339 · Updated 2 years ago
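Crop-CLIP crops images based on a natural-language query. Below is a minimal sketch of the general idea (not the repository's actual pipeline, which may use object-detection proposals instead of a sliding window): score candidate crops against a text prompt with CLIP and keep the best-matching one. The function name `best_crop` and the window/stride parameters are illustrative assumptions.

```python
# Hedged sketch: sliding-window cropping guided by CLIP similarity.
# Requires torch, Pillow, and OpenAI's `clip` package (pip install git+https://github.com/openai/CLIP.git).
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def best_crop(image_path: str, query: str, window: int = 224, stride: int = 112):
    """Return the (left, top, right, bottom) box whose crop best matches `query`."""
    image = Image.open(image_path).convert("RGB")
    text = clip.tokenize([query]).to(device)

    # Enumerate sliding-window crops and preprocess each for CLIP.
    boxes, crops = [], []
    for top in range(0, max(image.height - window, 0) + 1, stride):
        for left in range(0, max(image.width - window, 0) + 1, stride):
            box = (left, top, left + window, top + window)
            boxes.append(box)
            crops.append(preprocess(image.crop(box)))

    # Score every crop against the text query via cosine similarity.
    with torch.no_grad():
        image_features = model.encode_image(torch.stack(crops).to(device))
        text_features = model.encode_text(text)
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        scores = (image_features @ text_features.T).squeeze(1)

    return boxes[int(scores.argmax())]

# Example: box = best_crop("photo.jpg", "a dog wearing sunglasses")
```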
Alternatives and similar repositories for Crop-CLIP:
Users interested in Crop-CLIP are comparing it to the repositories listed below.
- ☆657 · Updated last year
- Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch ☆547 · Updated 2 years ago
- CLIP Object Detection: search for objects in an image using natural language. #Zeroshot #Unsupervised #CLIP #ODS ☆139 · Updated 3 years ago
- Contrastive Language-Image Forensic Search allows free-text searching through videos using OpenAI's machine learning model CLIP ☆472 · Updated 3 years ago
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic ☆275 · Updated 2 years ago
- Search photos on Unsplash based on OpenAI's CLIP model; supports search with joint image+text queries and attention visualization. ☆222 · Updated 3 years ago
- Code release for SLIP: Self-supervision Meets Language-Image Pre-training ☆766 · Updated 2 years ago
- ☆1,177 · Updated 2 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022) ☆242 · Updated 2 years ago
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm ☆653 · Updated 2 years ago
- Language Models Can See: Plugging Visual Controls in Text Generation ☆256 · Updated 2 years ago
- OpenAI CLIP text encoders for multiple languages! ☆796 · Updated last year
- ☆335 · Updated 2 years ago
- Repository for "Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search" ☆180 · Updated 3 years ago
- Using CLIP and StyleGAN to generate faces from prompts. ☆131 · Updated 3 years ago
- GIT: A Generative Image-to-text Transformer for Vision and Language ☆566 · Updated last year
- Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21] ☆360 · Updated 2 years ago
- PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors ☆336 · Updated 2 years ago
- ☆269 · Updated 5 months ago
- Omnivore: A Single Model for Many Visual Modalities ☆563 · Updated 2 years ago
- ☆1,000 · Updated 2 years ago
- Robust fine-tuning of zero-shot models ☆699 · Updated 3 years ago
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆709 · Updated last year
- Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch ☆532 · Updated last year
- Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training. ☆390 · Updated 2 years ago
- Get hundreds of millions of image+URL pairs from the Crawling@Home dataset and preprocess them ☆220 · Updated 11 months ago
- A PyTorch Lightning solution to training OpenAI's CLIP from scratch. ☆691 · Updated 3 years ago
- ☆351 · Updated 3 years ago
- ☆111 · Updated 3 years ago
- ☆198 · Updated 3 years ago