CLIP-based simple image-text matching baseline for COCO and F30K
☆14Sep 16, 2021Updated 4 years ago
Alternatives and similar repositories for Clip_CMR
Users who are interested in Clip_CMR are comparing it to the libraries listed below
- WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching☆16Dec 10, 2021Updated 4 years ago
- Based on the WACV 2020 paper - Fine Grained Classification and Retrieval by Combining Visual and Locally Pooled Textual Features☆25Nov 15, 2021Updated 4 years ago
- Scene Text Aware Cross Modal Retrieval (StacMR)☆24Sep 3, 2021Updated 4 years ago
- PyTorch implementation of the ECCV 2018 paper - Single Shot Scene Text Retrieval☆13Dec 15, 2021Updated 4 years ago
- STVQA and TextVQA OCR results from Amazon Text in Image pipeline☆12Jul 18, 2022Updated 3 years ago
- A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers☆59Sep 9, 2024Updated last year
- Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval☆64Dec 1, 2022Updated 3 years ago
- ☆38Feb 4, 2023Updated 3 years ago
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆38Jan 25, 2024Updated 2 years ago
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆39Feb 14, 2026Updated 2 weeks ago
- [ICCV 2023] "TRM-UAP: Enhancing the Transferability of Data-Free Universal Adversarial Perturbation via Truncated Ratio Maximization", Yi…☆13Jul 17, 2024Updated last year
- Handy Utilities for Computer Vision☆12Updated this week
- Optical Character Recognition (OCR / HTR) using Transformers☆11Aug 20, 2022Updated 3 years ago
- ☆15Feb 11, 2025Updated last year
- Professor and Group List of CS☆10Mar 12, 2024Updated last year
- Tools to estimate the correlation of different text-based evaluation measures for Automatic Image Description☆10Feb 2, 2017Updated 9 years ago
- Measure the diversity of image descriptions, repository for our COLING 2018 paper.☆13Dec 29, 2019Updated 6 years ago
- ☆20Feb 3, 2025Updated last year
- This repository contains an implementation of DHNE: a network representation learning method for dynamic heterogeneous networks.☆10May 11, 2019Updated 6 years ago
- Code for CLVision workshop (CVPR 2024) paper - Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-train…☆11Nov 12, 2024Updated last year
- Reversible data hiding in the plaintext spatial domain☆11Jul 6, 2020Updated 5 years ago
- Reproduction of the paper "Continual Vision-Language Representation Learning with Off-Diagonal Information" (Mod-X)☆10Oct 31, 2023Updated 2 years ago
- Code for the paper "Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation" (TPAMI 2021)☆10Jul 15, 2022Updated 3 years ago
- Official frontend web application for Moltbook - The Social Network for AI Agents. Built with Next.js 14, TypeScript, Tailwind CSS featur…☆34Feb 1, 2026Updated last month
- Phase-aware Adversarial Defense for Improving Adversarial Robustness☆11Oct 12, 2023Updated 2 years ago
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …☆13Jul 13, 2022Updated 3 years ago
- ☆11Dec 8, 2022Updated 3 years ago
- Code for MICCAI 2021 submission 'Self-Supervised Multi-Modal Alignment For Whole Body Medical Imaging'☆16Sep 22, 2021Updated 4 years ago
- ☆11Jul 11, 2023Updated 2 years ago
- Summaries of ICML 2024 papers☆12Jul 31, 2024Updated last year
- Official implementation of "Relational Proxies: Emergent Relationships as Fine-Grained Discriminators", NeurIPS 2022.☆14Feb 1, 2025Updated last year
- ☆11Feb 9, 2023Updated 3 years ago
- Multi-Label Classification and Class Activation Map on Fashion MNIST☆11Mar 5, 2019Updated 7 years ago
- Image Classification Codebase with PyTorch☆15Sep 10, 2025Updated 5 months ago
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 9 months ago
- [ICLR 2024] Towards Robust Multi-Modal Reasoning via Model Selection☆15Mar 7, 2024Updated last year
- The official implementation of CVPR 2025 paper "Invisible Backdoor Attack against Self-supervised Learning"☆17Jul 5, 2025Updated 8 months ago
- A Generated Face Dataset: AGFD-20K. A Realistic, High-resolution, Vary & Balanced face dataset, generated by stable diffusion.☆11Nov 5, 2023Updated 2 years ago
- [USENIX Security 2022] Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture☆16Aug 29, 2022Updated 3 years ago