pkulium / DeepOCR

☆156

Alternatives and similar repositories for DeepOCR

Users who are interested in DeepOCR are comparing it to the repositories listed below.

Tencent / llm.hunyuan.T1
☆85 Updated 7 months ago
MiniMax-AI / One-RL-to-See-Them-All
The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning
☆330 Updated 5 months ago
VainF / Thinkless
[NeurIPS 2025] Thinkless: LLM Learns When to Think
☆242 Updated last month
We-Math / We-Math2.0
The code and data of We-Math 2.0.
☆161 Updated 2 months ago
ZihanWang314 / CoE
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models
☆223 Updated 2 weeks ago
MiroMindAI / MiroMind-M1
MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.
☆241 Updated 3 months ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆348 Updated 2 months ago
si0wang / ThinkLite-VL
☆105 Updated 5 months ago
JinjieNi / MegaDLMs
GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr…
☆259 Updated last week
callsys / GMPO
Geometric-Mean Policy Optimization
☆90 Updated last week
Tongyi-Zhiwen / Qwen-Doc
☆299 Updated 5 months ago
CSfufu / Revisual-R1
🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei…
☆190 Updated last month
pangu-tech / pangu-ultra
☆73 Updated 5 months ago
yannqi / R-4B
The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"
☆122 Updated 2 months ago
MiroMindAI / MiroTrain
MiroTrain is an efficient and algorithm-first framework for post-training large agentic models.
☆93 Updated 2 months ago
agents-x-project / PyVision
Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."
☆133 Updated 3 months ago
Tencent / digitalhuman
☆171 Updated last week
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆175 Updated 8 months ago
rednote-hilab / dots.vlm1
The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.
☆264 Updated last month
GAIR-NLP / MAYE
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
☆145 Updated 7 months ago
OPPO-PersonalAI / OAgents
Implementation for OAgents: An Empirical Study of Building Effective Agents
☆282 Updated last month
zhaochen0110 / OpenThinkIMG
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
☆327 Updated 5 months ago
aakaran / reasoning-with-sampling
☆317 Updated last week
yihedeng9 / OpenVLThinker
OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement
☆119 Updated 3 months ago
InternLM / Intern-S1
A Scientific Multimodal Foundation Model
☆607 Updated last month
JT-Ushio / MHA2MLA
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
☆194 Updated last month
bigai-nlco / TokenSwift
[ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation
☆118 Updated 6 months ago
cnzzx / VSA
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
☆126 Updated last year
MiniMax-AI / SynLogic
[NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
☆186 Updated 4 months ago
HITsz-TMG / Awesome-Large-Multimodal-Reasoning-Models
The development and future prospects of large multimodal reasoning models.
☆545 Updated 3 months ago