LAION-AI / model-retrieval
Easily compute model embeddings and save the embeddings.
☆10 Updated 2 years ago
Alternatives and similar repositories for model-retrieval
Users who are interested in model-retrieval are comparing it to the libraries listed below.
- ☆35 Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆27 Updated last year
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆43 Updated 9 months ago
- [CVPR-2023] The official dataset of Advancing Visual Grounding with Scene Knowledge: Benchmark and Method. ☆32 Updated 2 years ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity" ☆60 Updated 2 years ago
- Official Code of ECCV 2022 paper MS-CLIP ☆90 Updated 3 years ago
- Training code for CLIP-FlanT5 ☆29 Updated last year
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆18 Updated 4 years ago
- ☆13 Updated 3 years ago
- Use CLIP to represent video for Retrieval Task ☆70 Updated 4 years ago
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions ☆55 Updated last year
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆40 Updated 2 years ago
- VideoCC is a dataset containing (video-URL, caption) pairs for training video-text machine learning models. It is created using an automa… ☆78 Updated 2 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone" ☆34 Updated 3 years ago
- TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?" ☆35 Updated 3 years ago
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆49 Updated last year
- Code for the Video Similarity Challenge. ☆80 Updated last year
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆54 Updated 2 years ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆92 Updated 5 months ago
- ☆53 Updated 3 years ago
- Turning to Video for Transcript Sorting ☆48 Updated 2 years ago
- Official code for "Disentangling Visual Embeddings for Attributes and Objects" Published at CVPR 2022 ☆35 Updated 2 years ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 Updated last year
- ☆30 Updated 2 years ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆76 Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆24 Updated 10 months ago
- [WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling ☆53 Updated 4 months ago
- ☆24 Updated 2 years ago
- ☆58 Updated last year
- Using LLMs and pre-trained caption models for super-human performance on image captioning. ☆42 Updated last year