CLIP-based simple image-text matching baseline for COCO and F30K
☆14Sep 16, 2021Updated 4 years ago
Alternatives and similar repositories for Clip_CMR
Users that are interested in Clip_CMR are comparing it to the libraries listed below.
- WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching☆16Dec 10, 2021Updated 4 years ago
- Based on the WACV 2020 paper - Fine Grained Classification and Retrieval by Combining Visual and Locally Pooled Textual Features☆25Nov 15, 2021Updated 4 years ago
- Scene Text Aware Cross Modal Retrieval (StacMR)☆24Sep 3, 2021Updated 4 years ago
- PyTorch implementation of the ECCV 2018 paper - Single Shot Scene Text Retrieval☆13Dec 15, 2021Updated 4 years ago
- STVQA and TextVQA OCR results from Amazon Text in Image pipeline☆12Jul 18, 2022Updated 3 years ago
- ☆38Feb 4, 2023Updated 3 years ago
- A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers☆59Sep 9, 2024Updated last year
- Handy Utilities for Computer Vision☆12Updated this week
- Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval☆64Dec 1, 2022Updated 3 years ago
- ☆21Jul 6, 2022Updated 3 years ago
- Measure the diversity of image descriptions, repository for our COLING 2018 paper.☆13Dec 29, 2019Updated 6 years ago
- Spatial-domain plaintext reversible data hiding☆11Jul 6, 2020Updated 5 years ago
- Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models☆21May 29, 2025Updated 9 months ago
- This is an official PyTorch code for our accepted paper "When All We Need is a Piece of the Pie: A Generic Framework for Optimizing Two-w…☆15Jul 7, 2022Updated 3 years ago
- Professor and Group List of CS☆10Mar 12, 2024Updated 2 years ago
- code for our BMVC 2021 paper "HCV: Hierarchy-Consistency Verification for Incremental Implicitly-Refined Classification"☆15Oct 28, 2022Updated 3 years ago
- Tools to estimate the correlation of different text-based evaluation measures for Automatic Image Description☆10Feb 2, 2017Updated 9 years ago
- Reproduction of the paper "Continual Vision-Language Representation Learning with Off-Diagonal Information" (Mod-X)☆11Oct 31, 2023Updated 2 years ago
- This repository contains implementation of DHNE : Network Representation Learning Method for Dynamic Heterogeneous Network.☆10May 11, 2019Updated 6 years ago
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆38Jan 25, 2024Updated 2 years ago
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …☆13Jul 13, 2022Updated 3 years ago
- code for the paper "Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation" (TPAMI 2021)☆10Jul 15, 2022Updated 3 years ago
- Let there be clock in the beach - WACV 2022☆15Nov 15, 2021Updated 4 years ago
- auto_labeler - An all-in-one library to automatically label vision data☆21Jan 17, 2025Updated last year
- Inspired by the success and computational efficiency of convolutional architectures for various sequential tasks compared to recurrent ne…☆19Jan 23, 2018Updated 8 years ago
- VIsually-Pivoted Audio and(N) Text☆22May 16, 2022Updated 3 years ago
- TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers☆21Jul 26, 2022Updated 3 years ago
- Code for MICCAI 2021 submission 'Self-Supervised Multi-Modal Alignment For Whole Body Medical Imaging'☆16Sep 22, 2021Updated 4 years ago
- Repository for AAAI 2018 paper "Using Syntax for Referring Expression Recognition"☆13Oct 7, 2020Updated 5 years ago
- ☆17Oct 22, 2020Updated 5 years ago
- Multi-Label Classification and Class Activation Map on Fashion MNIST☆11Mar 5, 2019Updated 7 years ago
- Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…☆14Dec 9, 2021Updated 4 years ago
- [ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".☆30Dec 8, 2024Updated last year
- ☆13Dec 8, 2022Updated 3 years ago
- code for the paper "ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts" (CVPR 2022)☆10Jul 17, 2022Updated 3 years ago
- A hands-on & simple tutorial for out-of-distribution generalization.☆18Apr 23, 2022Updated 3 years ago
- ☆35Mar 9, 2026Updated 2 weeks ago
- ☆24Dec 22, 2016Updated 9 years ago
- Code and data for "Learning Program Representations for Food Images and Cooking Recipes" (oral at CVPR 2022)☆15Mar 30, 2022Updated 3 years ago