hiaac-nlp / CAPIVARA
☆28 · Updated 11 months ago
Alternatives and similar repositories for CAPIVARA
Users who are interested in CAPIVARA are comparing it to the libraries listed below
- A Large-scale Multi-modal E-Commerce Products Dataset (LTDL@IJCAI-21 Best Dataset & Pattern Recognition 2023) ☆32 · Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆138 · Updated 2 years ago
- CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022) ☆34 · Updated 2 years ago
- [ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning ☆39 · Updated last year
- ☆86 · Updated last year
- ☆29 · Updated 2 years ago
- Code and results accompanying our paper titled CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets ☆57 · Updated 2 years ago
- ☆46 · Updated 3 years ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity" ☆59 · Updated 2 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆48 · Updated last year
- [CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features ☆81 · Updated 8 months ago
- Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval o… ☆39 · Updated last year
- ☆33 · Updated last year
- ☆52 · Updated 2 years ago
- SimVLM --- SIMPLE VISUAL LANGUAGE MODEL PRETRAINING WITH WEAK SUPERVISION ☆36 · Updated 2 years ago
- ☆120 · Updated 2 years ago
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆135 · Updated last year
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features ☆180 · Updated last year
- Official repository for the ICCV 2023 paper: "Waffling around for Performance: Visual Classification with Random Words and Broad Concepts… ☆57 · Updated 2 years ago
- Code for our ICLR 2024 paper "PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts" ☆77 · Updated last year
- [ICML 2022] This is the pytorch implementation of "Rethinking Attention-Model Explainability through Faithfulness Violation Test" (https:… ☆19 · Updated 3 years ago
- [CVPR 2023 (Highlight)] FAME-ViL: Multi-Tasking V+L Model for Heterogeneous Fashion Tasks ☆53 · Updated last year
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings) ☆197 · Updated last year
- PyTorch code for MUST ☆107 · Updated 2 months ago
- ☆17 · Updated 11 months ago
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆34 · Updated 10 months ago
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models" ☆34 · Updated 2 years ago
- ☆59 · Updated last year
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆77 · Updated last month
- This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision… ☆31 · Updated last year