DocTron-hub/OCRVerse

DocTron-hub / OCRVerse

OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models

☆30

Alternatives and similar repositories for OCRVerse

Users who are interested in OCRVerse are comparing it to the repositories listed below.

DocTron-hub / FD-RL
[CVPR 2026] Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR
☆17 · Mar 23, 2026 · Updated 4 months ago

DocTron-hub / Chart-R1
Chart-R1: Chain-of-Thought Supervision and Reinforcement for Advanced Chart Reasoner
☆24 · Aug 7, 2025 · Updated 11 months ago

UITron-hub / UITron-Speech
☆21 · Jan 22, 2026 · Updated 6 months ago

UITron-hub / UItron
☆67 · Sep 6, 2025 · Updated 10 months ago

DocTron-hub / VinciCoder
☆42 · Jan 9, 2026 · Updated 6 months ago

Yui010206 / MEXA
[EMNLP 2025 Findings] MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
☆15 · Aug 22, 2025 · Updated 11 months ago

BunnySoCrazy / LA-DocFlatten
Code and Dataset for our paper: Layout-Aware Single-Image Document Flattening
☆24 · Dec 16, 2024 · Updated last year

dbrainio / CyrillicHandwritingPOC
Repository for contributions for Data Generation for Post-OCR correction of Cyrillic handwriting paper
☆23 · Nov 27, 2023 · Updated 2 years ago

tim-lawson / skip-middle
Learning to Skip the Middle Layers of Transformers
☆17 · Aug 7, 2025 · Updated 11 months ago

Rishit-dagli / Squeeze3D
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor
☆23 · Jun 12, 2025 · Updated last year

liuzhuang1024 / liuzhuang1024
You found a secret! lzmisscc/lzmisscc is a ✨special ✨ repository that you can use to add a README.md to your GitHub profile. Make sure it…
☆13 · Apr 4, 2026 · Updated 3 months ago

Danielement321 / FM2S
[MIR] Pytorch Implementation for FM2S, a denoising algorithm for fluorescence microscopy.
☆15 · Mar 13, 2026 · Updated 4 months ago

RUCKBReasoning / DPO_Text2SQL
[ACL 2025] Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
☆16 · Oct 9, 2025 · Updated 9 months ago

WadeYin9712 / UI-Simulator
Code for 🌍 UI-Simulator: LLMs as Scalable, General-Purpose Simulators For Evolving Digital Agent Training
☆21 · Oct 17, 2025 · Updated 9 months ago

UARK-AICV / FG-CXR
The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge…
☆12 · Jul 28, 2025 · Updated 11 months ago

Seeing-Fast-and-Slow / Seeing-Fast-and-Slow
☆16 · May 28, 2026 · Updated last month

MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago

wang-zidu / BFSM
The official implementation of BFSM.
☆18 · Sep 30, 2025 · Updated 9 months ago

zoryzhang / referential-gaze
The public reproducible analysis code used for the gaze project
☆11 · May 16, 2026 · Updated 2 months ago

naver-ai / lut
[ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"
☆14 · Dec 1, 2024 · Updated last year

aiptimizer / fastpdf2png
☆15 · Apr 8, 2026 · Updated 3 months ago

KeyKy / model-zoo
a model zoo
☆11 · Jul 19, 2017 · Updated 9 years ago

LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 19, 2026 · Updated last week

Gary-code / CADReview
[ACL 2025 Oral] The official repository of our paper: CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction
☆23 · Aug 8, 2025 · Updated 11 months ago

JackLingjie / VisCodex
Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models"
☆23 · Aug 14, 2025 · Updated 11 months ago

MCG-NJU / EVAD
[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement
☆39 · Sep 27, 2023 · Updated 2 years ago

ExplainableML / fomo_in_flux
Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]
☆62 · Dec 10, 2024 · Updated last year

DataArcTech / ChartMoE
[ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
☆101 · Apr 1, 2025 · Updated last year

yang-ze-kang / AutoMMLab
☆27 · Feb 27, 2026 · Updated 5 months ago

RUC-NLPIR / HiRA
The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search [SIGIR 2026]
☆65 · Jul 4, 2025 · Updated last year

skrantidatta / LIPINC-V2
☆16 · Apr 10, 2025 · Updated last year

TencentCloudADP / youtu-parsing
Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding
☆69 · Jun 15, 2026 · Updated last month

SWHL / ChineseDocumentPDF
PDF data for Chinese academic papers, securities documents, and financial reports
☆41 · Jun 13, 2024 · Updated 2 years ago

shengliu66 / FractionalReason
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year

astonishedrobo / tabulens
🔍📃 LLM-powered PDF Table Extractor
☆19 · Jun 26, 2025 · Updated last year

Kyyle2114 / Convolutional-Adapter-for-Segment-Anything
CAD - Memory Efficient Convolutional Adapter for Segment Anything
☆12 · Oct 4, 2024 · Updated last year

CGCL-codes / DarkSAM
The implementation of our NeurIPS 2024 paper "DarkSAM: Fooling Segment Anything Model to Segment Nothing".
☆14 · Nov 4, 2024 · Updated last year

lucasjinreal / LLaVA-Magvit2
LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifying image understanding and generation.
☆38 · Jun 20, 2024 · Updated 2 years ago

MinglangQiao / Sports_saliency
Code for "Saliency Prediction of Sports Videos: A Large-Scale Database and a Self-Adaptive Approach", ICASSP 2024
☆14 · May 28, 2024 · Updated 2 years ago