AndresPMD/StacMR

Scene Text Aware Cross Modal Retrieval (StacMR)

☆24

Alternatives and similar repositories for StacMR

Users who are interested in StacMR are comparing it to the repositories listed below.

AndresPMD / Pytorch-yolo-phoc
PyTorch implementation of the code from the ECCV 2018 paper - Single Shot Scene Text Retrieval
☆13 · Dec 15, 2021 · Updated 4 years ago
AndresPMD / semantic_adaptive_margin
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
☆16 · Dec 10, 2021 · Updated 4 years ago
AndresPMD / Fine_Grained_Clf
Based on the WACV 2020 paper - Fine Grained Classification and Retrieval by Combining Visual and Locally Pooled Textual Features
☆25 · Nov 15, 2021 · Updated 4 years ago
AndresPMD / GCN_classification
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
☆65 · Dec 1, 2022 · Updated 3 years ago
AndresPMD / Clip_CMR
CLIP-based simple image-text matching baseline for COCO and F30K
☆15 · Sep 16, 2021 · Updated 4 years ago
furkanbiten / stvqa_amazon_ocr
STVQA and TextVQA OCR results from Amazon Text in Image pipeline
☆12 · Jul 18, 2022 · Updated 4 years ago
MCLAB-OCR / KnowledgeMiningWithSceneText
☆38 · Feb 4, 2023 · Updated 3 years ago
furkanbiten / object-bias
Let there be clock in the beach - WACV 2022
☆15 · Nov 15, 2021 · Updated 4 years ago
lanfeng4659 / STR-TDSL
☆82 · Jun 29, 2023 · Updated 3 years ago
furkanbiten / SelectiveTextStyleTransfer
ICDAR 2019
☆25 · Aug 2, 2019 · Updated 6 years ago
biswassanket / DocSegTr
A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers
☆59 · Sep 9, 2024 · Updated last year
guanghuixu / AnchorCaptioner
☆30 · May 7, 2021 · Updated 5 years ago
furkanbiten / GoodNews
Good News Everyone! - CVPR 2019
☆130 · Apr 14, 2022 · Updated 4 years ago
ayanban011 / SVGCraft
[WACV 2026 Round 1] Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout
☆24 · Oct 11, 2025 · Updated 9 months ago
lluisgomez / single-shot-str
Single Shot Scene Text Retrieval, ECCV 2018. L. Gomez*, A. Mafla*, M. Rusiñol, D. Karatzas.
☆68 · May 13, 2019 · Updated 7 years ago
dali92002 / DocEnTR
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
☆190 · Jan 17, 2025 · Updated last year
PKU-ICST-MIPL / SSDH_TCSVT2017
Source code of our TCSVT 2017 paper "SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval"
☆15 · May 29, 2019 · Updated 7 years ago
amazon-science / textadain-robust-recognition
TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
☆21 · Jul 26, 2022 · Updated 3 years ago
furkanbiten / idl_data
OCR Annotations from Amazon Textract for Industry Documents Library
☆103 · Aug 20, 2022 · Updated 3 years ago
dali92002 / DE-GAN
Document Image Enhancement with GANs - TPAMI journal
☆222 · Mar 24, 2023 · Updated 3 years ago
ChenyuGAO-CS / SMA
The imdb files with SBD-Trans OCR for TextVQA dataset.
☆11 · Nov 30, 2021 · Updated 4 years ago
biswassanket / synth_doc_generation
Official PyTorch Implementation of DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis - ICDAR 2021
☆93 · Jul 16, 2021 · Updated 5 years ago
LgQu / CAMERA
Context-Aware Multi-View Summarization Network for Image-Text Matching. ACM MM'20
☆29 · May 26, 2022 · Updated 4 years ago
Pay20Y / PIMNet
☆16 · Jan 30, 2022 · Updated 4 years ago
ezosa / M3L-topic-model
Multimodal and multilingual topic model with pretrained embeddings
☆12 · Apr 11, 2023 · Updated 3 years ago
weijiawu / Polygon-free-Unconstrained-Scene-Text-Detection-with-Box-Annotations
Unconstrained Text Detection with Box Supervision and Dynamic Self-Training
☆34 · Nov 24, 2022 · Updated 3 years ago
Cuberick-Orion / CIRPLANT
Official implementation of the Composed Image Retrieval using Pretrained LANguage Transformers (CIRPLANT) | ICCV 2021 - Image Retrieval o…
☆40 · Jun 26, 2024 · Updated 2 years ago
CrossmodalGroup / GSMN
Implementation of our CVPR2020 paper, Graph Structured Network for Image-Text Matching
☆170 · Oct 12, 2020 · Updated 5 years ago
sounakdey / SigNet
SigNet: Convolutional Siamese Network for Writer Independent Offline Signature Verification
☆80 · Oct 24, 2017 · Updated 8 years ago
Xiaomeng-Yang / STR_benchmark_cleansed
☆14 · May 26, 2023 · Updated 3 years ago
ZihaoWang-CV / CAMP_iccv19
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
☆127 · Feb 26, 2020 · Updated 6 years ago
clin1223 / MTVM
[ECCV 2022] Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
☆19 · Jul 18, 2022 · Updated 4 years ago
ronghanghu / mmf
A modular framework for Visual Question Answering research by the FAIR A-STAR team
☆45 · Aug 26, 2021 · Updated 4 years ago
Roc-Ng / HANet
PyTorch implementation of HANet: Hierarchical Alignment Networks for Video-Text Retrieval (ACM MM 2021).
☆47 · Aug 19, 2021 · Updated 4 years ago
CrossmodalGroup / CMCAN
Implementation of our AAAI2022 paper, Show Your Faith: Cross-Modal Confidence-Aware Network for Image-Text Matching.
☆36 · Jun 16, 2023 · Updated 3 years ago
evanmiltenburg / MeasureDiversity
Measure the diversity of image descriptions, repository for our COLING 2018 paper.
☆13 · Dec 29, 2019 · Updated 6 years ago
lluisgomez / TextTopicNet
Self-supervised learning of visual features through embedding images into text topic spaces
☆95 · Aug 20, 2022 · Updated 3 years ago
uakarsh / latr
Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer…
☆56 · Oct 30, 2024 · Updated last year
weijiawu / BOVText-Benchmark
[NeurIPS2021] BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting
☆71 · Oct 9, 2023 · Updated 2 years ago