cardinalblue / clip-models-for-distillation
☆18 Updated last year
Related projects
Alternatives and complementary repositories for clip-models-for-distillation
- A non-JIT version implementation / replication of CLIP of OpenAI in pytorch ☆34 Updated 3 years ago
- ☆25 Updated 3 years ago
- ☆32 Updated 2 years ago
- [FGVC9-CVPR 2022] The second place solution for 2nd eBay eProduct Visual Search Challenge. ☆26 Updated 2 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone" ☆33 Updated 2 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆17 Updated 3 years ago
- codebase for the SIMAT dataset and evaluation ☆38 Updated 2 years ago
- ☆43 Updated 3 years ago
- This is the official repository for CookGAN: Meal Image Synthesis from Ingredients ☆23 Updated last year
- ☆47 Updated 3 years ago
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval ☆26 Updated 3 years ago
- ☆22 Updated 2 years ago
- Use CLIP to represent video for Retrieval Task ☆69 Updated 3 years ago
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆50 Updated last year
- ☆11 Updated 4 years ago
- [ECCV2022] Contrastive Vision-Language Pre-training with Limited Resources ☆44 Updated 2 years ago
- Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆39 Updated last year
- Script and models for clustering LAION-400m CLIP embeddings. ☆25 Updated 2 years ago
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration ☆56 Updated last year
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022) ☆28 Updated 2 years ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆45 Updated 11 months ago
- Official code repository for the EMNLP 2021 paper ☆26 Updated 2 years ago
- Simple script to compute CLIP-based scores given a DALL-e trained model. ☆30 Updated 3 years ago
- [NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images ☆58 Updated 2 years ago
- Code for ICCV2021: Discovering Human Interactions with Large-Vocabulary Objects via Query and Multi-Scale Detection ☆24 Updated 3 years ago
- CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification - 4th Workshop on Computer Vision for Fashion, Art, and Design ☆27 Updated 2 years ago
- ☆28 Updated 4 years ago
- [BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment" ☆54 Updated 2 years ago