Let there be clock in the beach - WACV 2022
☆15Nov 15, 2021Updated 4 years ago
Alternatives and similar repositories for object-bias
Users that are interested in object-bias are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching☆16Dec 10, 2021Updated 4 years ago
- Hadwritten Text Recognition in Few-shot Scenario☆22Mar 25, 2023Updated 3 years ago
- Optocal Character Recognition (OCR / HTR) using Transformers☆11Aug 20, 2022Updated 3 years ago
- Based on the WACV 2020 paper - Fine Grained Classification and Retrieval by Combining Visual and Locally Pooled Textual Features☆25Nov 15, 2021Updated 4 years ago
- [ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation☆15Sep 6, 2024Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- A Bottom-Up Instance Segmentation Strategy for segmenting document instances using Transformers☆59Sep 9, 2024Updated last year
- [WACV 2026 Round 1] Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout☆23Oct 11, 2025Updated 6 months ago
- Implementation on pytorch of the code from the ECCV 2018 paper - Single Shot Scene Text Retrieval☆13Dec 15, 2021Updated 4 years ago
- ☆16Jun 14, 2024Updated last year
- OCR Annotations from Amazon Textract for Industry Documents Library☆103Aug 20, 2022Updated 3 years ago
- STVQA and TextVQA OCR results from Amazon Text in Image pipeline☆12Jul 18, 2022Updated 3 years ago
- Official PyTorch Implementation of DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis - ICDAR 2021☆93Jul 16, 2021Updated 4 years ago
- Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval☆65Dec 1, 2022Updated 3 years ago
- The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"☆135Jan 2, 2025Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Scene Text Aware Cross Modal Retrieval (StacMR)☆24Sep 3, 2021Updated 4 years ago
- Doc2Graph transforms documents into graphs and exploit a GNN to solve several tasks.☆137Oct 18, 2025Updated 6 months ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023☆30Jul 12, 2023Updated 2 years ago
- Measure the diversity of image descriptions, repository for our COLING 2018 paper.☆13Dec 29, 2019Updated 6 years ago
- [WACV 2025] Code for Enhancing Vision-Language Few-Shot Adaptation with Negative Learning☆11Feb 24, 2025Updated last year
- The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoi…☆24Apr 16, 2023Updated 3 years ago
- Look and Modify: Modification Networks for Image Captioning, BMVC 2019☆21Feb 18, 2020Updated 6 years ago
- ☆38Feb 4, 2023Updated 3 years ago
- CLIP-based simple image-text matching baseline for COCO and F30K☆14Sep 16, 2021Updated 4 years ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers☆21Jul 26, 2022Updated 3 years ago
- DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022☆187Jan 17, 2025Updated last year
- ☆10Jan 28, 2016Updated 10 years ago
- Code for Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models☆25Oct 29, 2024Updated last year
- [ICLR 2025] Code for Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models☆25Apr 14, 2025Updated last year
- This repository contains the source code, models and data files for the work titled: "Unsupervised Image Style Embeddings for Retrieval a…☆13May 29, 2021Updated 4 years ago
- [AAAI 2021] Confidence-aware Non-repetitive Multimodal Transformers for TextCaps☆24Mar 29, 2023Updated 3 years ago
- Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)☆37May 16, 2022Updated 3 years ago
- Implementation of LaTr: Layout-aware transformer for scene-text VQA,a novel multimodal architecture for Scene Text Visual Question Answer…☆56Oct 30, 2024Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Ranking-Consistent Language-Image Pretraining☆13Oct 24, 2025Updated 6 months ago
- Deep Learning - Visual Representation Learning by solving Jigsaw puzzles using Deep Reinforcement Learning☆10Dec 8, 2016Updated 9 years ago
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection☆17Mar 23, 2025Updated last year
- ☆24Dec 22, 2016Updated 9 years ago
- ☆10Oct 18, 2021Updated 4 years ago
- Fill the boring catsalud covid vaccine form with a console command☆16Jul 17, 2021Updated 4 years ago
- Multi-Pedestrian Tracking in Monocular Calibrated Cameras☆11Feb 21, 2014Updated 12 years ago