[arXiv: 2505.12307] LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images?
☆35 · Dec 1, 2025 · Updated 3 months ago
Alternatives and similar repositories for LogicOCR
Users who are interested in LogicOCR are comparing it to the repositories listed below.
- ☆17 · Jul 24, 2025 · Updated 7 months ago
- Official repo for "AnesSuite: A Comprehensive Benchmark and Dataset Suite for Anesthesiology Reasoning in LLMs" ☆22 · Jan 18, 2026 · Updated last month
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching ☆28 · May 29, 2025 · Updated 9 months ago
- Official repo for "S5: Scalable Semi-Supervised Semantic Segmentation in Remote Sensing" ☆33 · Dec 4, 2025 · Updated 2 months ago
- This is the pytorch implementation of FCL-Net, accepted by NN'2022. ☆14 · May 25, 2022 · Updated 3 years ago
- [EMNLP22] Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models ☆22 · Mar 27, 2023 · Updated 2 years ago
- The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting" ☆16 · May 3, 2023 · Updated 2 years ago
- ☆22 · May 30, 2023 · Updated 2 years ago
- Source code of COLING 2022 paper "A Contrastive Cross-channel Data Augmentation Framework for Aspect-based Sentiment Analysis" ☆22 · Feb 18, 2023 · Updated 3 years ago
- [ISBI 2025] XLSTM-HVED: Cross-Modal Brain Tumor Segmentation and MRI Reconstruction Method Using Vision XLSTM and Heteromodal Variational… ☆18 · Jul 9, 2025 · Updated 7 months ago
- ☆12 · Apr 26, 2024 · Updated last year
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis ☆17 · Mar 27, 2023 · Updated 2 years ago
- The dataset used in the CVPR 2022 paper (SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Norm… ☆34 · Jun 21, 2022 · Updated 3 years ago
- (CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting. ☆71 · Jun 11, 2024 · Updated last year
- Official repo for "TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series" ☆28 · May 14, 2025 · Updated 9 months ago
- [MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking f… ☆20 · Dec 4, 2024 · Updated last year
- VimTS: A Unified Video and Image Text Spotter ☆78 · Nov 10, 2024 · Updated last year
- ☆21 · Dec 12, 2025 · Updated 2 months ago
- - ☆23 · Oct 25, 2022 · Updated 3 years ago
- A comprehensive list [Hi-SAM@TPAMI'24, GoMatching@NeurIPS'24, DeepSolo(++)@ CVPR'23, DPText-DETR@AAAI'23, I3CL@IJCV'22] of our research w… ☆93 · Nov 12, 2024 · Updated last year
- This repository is the implementation of "Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Contex… ☆96 · Feb 21, 2023 · Updated 3 years ago
- [ICLR 2026] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning ☆73 · Dec 17, 2025 · Updated 2 months ago
- [TKDE] Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis ☆50 · Apr 4, 2024 · Updated last year
- Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024) ☆67 · Jun 6, 2024 · Updated last year
- [ACL 2025 main] The official GitHub page of "Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restorati… ☆54 · Dec 22, 2025 · Updated 2 months ago
- ☆61 · Oct 21, 2022 · Updated 3 years ago
- ☆18 · Sep 23, 2025 · Updated 5 months ago
- [IEEE TPAMI] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation ☆347 · May 30, 2025 · Updated 9 months ago
- Code for CVPR21 paper A Multiplexed Network for End-to-End, Multilingual OCR ☆80 · Dec 2, 2022 · Updated 3 years ago
- [TIP2025] The implementation of "Uncertainty Guided Refinement for Fine-grained Salient Object Detection" ☆15 · Apr 20, 2025 · Updated 10 months ago
- Scalable DBSCAN and OPTICS for clustering high-dimensional datasets using random projections ☆13 · Nov 1, 2024 · Updated last year
- Official implementation of SPTS: Single-Point Text Spotting (ACM MM 2022 Oral) ☆144 · Jul 26, 2023 · Updated 2 years ago
- Code repository supporting the paper "Auto-Generating Weak Labels for Real & Synthetic Data to Improve Label-Scarce Medical Image Segment… ☆11 · Apr 29, 2024 · Updated last year
- [IROS 2025] EgoLoc: Zero-Shot Temporal Interaction Localization for Egocentric Videos ☆32 · Jan 13, 2026 · Updated last month
- A semi print-in-place hand for human-like manipulation, designed to be built by anyone. ☆17 · Jan 5, 2026 · Updated last month
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆24 · Dec 4, 2025 · Updated 2 months ago
- ☆31 · Updated this week
- The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++:… ☆282 · May 30, 2025 · Updated 9 months ago
- Analyse and Design Deep Neural Network, Dr.Kalhor, University of Tehran ☆11 · Feb 18, 2024 · Updated 2 years ago