NiceRingNode/Awesome-Generative-Models-for-OCR

[arXiv 25] OCRGenBench: A Comprehensive Benchmark for Evaluating OCR Generative Capabilities

☆273

Alternatives and similar repositories for Awesome-Generative-Models-for-OCR

Users who are interested in Awesome-Generative-Models-for-OCR are comparing it to the repositories listed below.

NiceRingNode / LGGPT
[IJCV 2025] Smaller But Better: Unifying Layout Generation with Smaller Large Language Models
☆158 · Aug 3, 2025 · Updated 11 months ago
SCUT-DLVCLab / DOLPHIN
[IEEE TIFS 2024] Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Appro…
☆57 · Aug 3, 2025 · Updated 11 months ago
HCIILAB / MSDS
[NeurIPS 2022 Spotlight] The official GitHub page of "MSDS: A Large-Scale Chinese Signature and Token Digit String Dataset for Handwritin…
☆95 · Jul 17, 2026 · Updated last week
jailflip / jailflip-2025
☆22 · Jan 9, 2026 · Updated 6 months ago
SCUT-DLVCLab / OCR-Reasoning
[ICLR 2026] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning
☆76 · May 26, 2026 · Updated last month
ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching i…
☆45 · Jul 10, 2025 · Updated last year
lcy0604 / QT-TextSR
This repository is the implementation of "QT-TextSR: Enhancing scene text image super-resolution via efficient interaction with text reco…
☆20 · Jul 9, 2025 · Updated last year
GiantAILab / DeepSound-V1
Official code for DeepSound-V1
☆12 · May 14, 2025 · Updated last year
ltlhuuu / PSEC
[ICLR 2025] The official implementation of "PSEC: Skill Expansion and Composition in Parameter Space", a new framework designed to facilit…
☆65 · Feb 12, 2025 · Updated last year
Susan571 / LENSLLM
This repository contains the code for our ICML 2025 paper "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉
☆26 · May 29, 2025 · Updated last year
GiantAILab / DeepDubber-V1
DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning…
☆30 · Sep 7, 2025 · Updated 10 months ago
GiantAILab / Video-to-Audio-and-Piano
☆18 · May 14, 2025 · Updated last year
lichengliu03 / unary-feedback
☆44 · Mar 31, 2026 · Updated 3 months ago
shi-yx / URaG
Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026…
☆43 · Feb 4, 2026 · Updated 5 months ago
lcy0604 / CTRNet-plus
The official implementation of CTRNet++.
☆15 · Dec 30, 2024 · Updated last year
SCUT-DLVCLab / AutoHDR
[ACL 2025 main] The official GitHub page of "Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restorati…
☆61 · Jun 28, 2026 · Updated 3 weeks ago
CUHK-Shenzhen-SE / UTBoost
[ACL'25] UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
☆36 · Aug 12, 2025 · Updated 11 months ago
JaydenLyh / SmPO
[ICML 2025] Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
☆30 · Jun 29, 2025 · Updated last year
NiceRingNode / PartialConvolution
An unofficial re-implementation of the article "[ECCV 18] Image Inpainting for Irregular Holes Using Partial Convolutions"
☆12 · Mar 1, 2025 · Updated last year
SCUT-DLVCLab / RFUND
[MM'2024] Official release of RFUND introduced in the MM'2024 paper "PEneo: Unifying Line Extraction, Line Grouping, and Entity Linking f…
☆21 · Dec 4, 2024 · Updated last year
inFaaa / Awesome-Personalized-Video-Creation
📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.
☆64 · Dec 9, 2025 · Updated 7 months ago
yeungchenwa / HDR
[AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents
☆111 · Jun 28, 2026 · Updated 3 weeks ago
taco-group / LangCoop
🏆 Official implementation of LangCoop: Collaborative Driving with Natural Language
☆81 · Sep 12, 2025 · Updated 10 months ago
BIT-DA / ABS
[ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection
☆27 · Jun 27, 2025 · Updated last year
RylonW / DocNLC
Official code for DocNLC: A Document Image Enhancement Framework with Normalized and Latent Contrastive Representation for Multiple Degra…
☆44 · Mar 20, 2026 · Updated 4 months ago
Tim-Siu / reinforcement-distillation
Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"
☆33 · Jul 25, 2025 · Updated 11 months ago
liuhanze623 / AdaReNet
Official implementation.
☆30 · Jul 1, 2025 · Updated last year
PrismaX-Team / PhysUniBenchmark
☆20 · Nov 27, 2025 · Updated 7 months ago
shannanyinxiang / ViTEraser
Official implementation of ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining (AAAI 20…
☆66 · Jul 4, 2024 · Updated 2 years ago
ZhishanQ / UniHGKR
The official repository of UniHGKR: Unified Instruction-aware Heterogeneous Knowledge Retrievers
☆27 · Jun 12, 2025 · Updated last year
ZZZHANG-jx / DocRes
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
☆628 · Aug 3, 2025 · Updated 11 months ago
DataArcTech / SQL-R1
[NeurIPS'25] Official Repository for the Paper "SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning"
☆145 · Nov 20, 2025 · Updated 8 months ago
fscdc / ReasonMap
[CVPR 2026] ReasonMap: Towards Fine-Grained Visual Reasoning from Transit Maps
☆86 · Feb 22, 2026 · Updated 5 months ago
Qiukunpeng / Siamese-Diffusion
[CVPR 2025] Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
☆90 · Nov 29, 2025 · Updated 7 months ago
waltonfuture / MM-UPT
[NeurIPS 2025] First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
☆88 · Oct 29, 2025 · Updated 8 months ago
zhirui-gao / Curve-Gaussian
[ICCV 2025] Official PyTorch Implementation of "Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction"
☆59 · Sep 5, 2025 · Updated 10 months ago
SCUT-DLVCLab / MegaHan97K
[PR 2025] The official GitHub page of "MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Ca…
☆84 · May 18, 2026 · Updated 2 months ago
caipeng328 / ForCenNet
☆81 · Jul 31, 2025 · Updated 11 months ago
shannanyinxiang / SPTS
Official implementation of SPTS: Single-Point Text Spotting (ACM MM 2022 Oral)
☆145 · Jul 26, 2023 · Updated 2 years ago