LAION-AI / model-retrieval
Easily compute model embeddings and save the embeddings.
☆10Updated 3 years ago
Alternatives and similar repositories for model-retrieval
Users who are interested in model-retrieval are comparing it to the libraries listed below.
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆28Updated 2 years ago
- Training code for CLIP-FlanT5☆30Updated last year
- ☆19Updated 2 years ago
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method.☆32Updated 2 years ago
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions☆56Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Updated last year
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Updated 4 years ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"☆36Updated 4 years ago
- ☆35Updated last year
- ☆25Updated 2 years ago
- ☆55Updated 2 years ago
- An image caption dataset about images from www.dpchallenge.com.☆19Updated 6 years ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"☆30Updated last year
- Official Code of ECCV 2022 paper MS-CLIP☆91Updated 3 years ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆35Updated last year
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity"☆61Updated 2 years ago
- [WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling☆55Updated 7 months ago
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model☆43Updated last year
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept…☆82Updated 2 years ago
- [ECCV2022] Motion Sensitive Contrastive Learning for Self-supervised Video Representation☆17Updated 3 years ago
- Official repo for StableLLAVA☆95Updated 2 years ago
- ImaginaryNet: Learning Object Detectors without Real Images and Annotations☆26Updated 2 years ago
- ☆58Updated last year
- ☆14Updated 3 years ago
- ORES: Open-vocabulary Responsible Visual Synthesis☆13Updated 2 years ago
- Code Release for the paper "Make-A-Story: Visual Memory Conditioned Consistent Story Generation" in CVPR 2023☆43Updated 2 years ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆101Updated 9 months ago
- Video dataset dedicated to portrait-mode video recognition.☆55Updated 2 months ago
- A curated list of papers and resources for text-to-image evaluation.☆30Updated 2 years ago
- Official implementation of TagAlign☆35Updated last year