hzauzxb/guidance-ocr

Uses OCR recognition results to constrain the answers of multimodal large language models in visual information extraction tasks

☆44

Alternatives and similar repositories for guidance-ocr

Users that are interested in guidance-ocr are comparing it to the libraries listed below.

Attendfov163com / chinese-layoutlm-v2
Multimodal language model for Chinese document understanding; supports multimodal document information extraction and document embedding
☆12 · Jun 26, 2022 · Updated 4 years ago
haoyiq114 / VALOR
Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)
☆16 · Apr 23, 2024 · Updated 2 years ago
varunsaagar / crawlwithagents
The Web Metadata Extraction Toolkit is designed to streamline the process of extracting, cleaning, and analyzing metadata from websites. …
☆17 · Jul 8, 2024 · Updated 2 years ago
Huoyuuu / SearchGPT-Explorer
Integrates search APIs with GPT models for real-time web access, enabling intelligent Q&A and information retrieval similar to New Bing. …
☆40 · Jul 11, 2024 · Updated 2 years ago
jinga-lala / DAMEX
Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer…
☆28 · Mar 29, 2024 · Updated 2 years ago
tejtw / TEJAPI_Python_Medium_Application
TEJ_API_Python practical applications
☆14 · Dec 26, 2024 · Updated last year
qiufengyuyi / lear_ner_extraction
using lear to do ner extraction
☆29 · Mar 13, 2022 · Updated 4 years ago
ant-research / M2-Miner
[ICLR 2026] M2-Miner: Multi-Agent Enhanced MCTS for Mobile GUI Agent Data Mining
☆55 · Apr 22, 2026 · Updated 2 months ago
BlueCrescent / DocLLM
Implementation of the DocLLM paper for Llama models.
☆13 · Apr 6, 2025 · Updated last year
rossumai / docile
DocILE: Document Information Localization and Extraction Benchmark
☆149 · Jun 17, 2026 · Updated last month
openvino-book / PaddleOCR-VL-SFT-for-Japanese-Manga-on-RTX-3060
Fine-tune PaddleOCR-VL on the Manga109s dataset for Japanese manga text recognition. The base model struggles with vertical Japanese text…
☆15 · Dec 7, 2025 · Updated 7 months ago
lplping / few-shot_ner_chinese
Chinese few-shot NER
☆16 · Aug 28, 2022 · Updated 3 years ago
bajibabu / GlottGAN
This repository contains the files used for our Interspeech 2017 paper.
☆16 · May 30, 2017 · Updated 9 years ago
osmr / imgret
Sandbox for image retrieval models and algorithms
☆20 · Dec 19, 2018 · Updated 7 years ago
sugarforever / langchain-serve-example
☆12 · Oct 12, 2023 · Updated 2 years ago
WangJingyao07 / Embodied-AI-Papers-with-Code
🎉🎨 This repository contains a reading list of papers on Embodied AI, including LLM/MLLM/VLA.
☆13 · Aug 18, 2025 · Updated 11 months ago
angellyao / formulaandcode
Lu Wei's "Machine Learning: Formula Derivation and Code Implementation" (《机器学习公式推导与代码实现》). The overall classification of the algorithms is a highlight. The algorithm principles and code implementations are also relatively simple, and the book can be read alongside "Machine Learning in Action" (《机器学习实战》).
☆10 · Oct 19, 2022 · Updated 3 years ago
MattShannon / HTS-demo_CMU-ARCTIC-SLT-STRAIGHT-AR-decision-tree
Autoregressive HMM version of the HTS demo for statistical speech synthesis (includes autoregressive clustering)
☆16 · Sep 12, 2014 · Updated 11 years ago
llorz / SIG25_goldschmiedrisse
Code for the SIGGRAPH 2025 paper "Computational Modeling of Gothic Microarchitecture"
☆17 · Apr 25, 2025 · Updated last year
dmoonat / Named-Entity-Recognition
☆16 · Jun 19, 2022 · Updated 4 years ago
guenthermi / the-movie-database-import
Script to import data from The Movie Database to PostgreSQL (Dataset URL: https://www.kaggle.com/rounakbanik/the-movies-dataset)
☆11 · Mar 20, 2020 · Updated 6 years ago
YourHealer / DM-Wine-Quality-Analysis
Data mining: wine quality analysis
☆16 · Jan 17, 2023 · Updated 3 years ago
zcsxll / date_trans_with_transformer
A PyTorch implementation of a machine translation model (transformer) that translates human readable dates ("25th of June, 2009") into machi…
☆12 · Sep 28, 2020 · Updated 5 years ago
heimy2000 / CMAT
☆21 · Feb 26, 2024 · Updated 2 years ago
yongzhuo / layoutlmv3-layoutxlm-chinese
Chinese document classification of layoutlmv3 and layoutxlm
☆45 · Oct 25, 2022 · Updated 3 years ago
ethan-funny / explore-llm-agents
Dive into LLM Agents
☆18 · Jun 1, 2024 · Updated 2 years ago
kkdai / linebot-gemma
A LINE Bot demo showcasing how to use a local LLM (Gemma) via Groq to modify personal information and detect the need for LLM assistance.
☆17 · Jul 25, 2024 · Updated last year
1001WillsStudio / AuroraCoder
An autonomous AI coding agent with novel innovations in tool state management and code editing, running in a Docker sandbox with a persis…
☆18 · Updated this week
eliemichel / WebGPU-AutoLayout
An online utility tool to generate C++ boilerplate binding code by parsing WGSL.
☆13 · Aug 11, 2024 · Updated last year
bajibabu / postfilt_gan
This is an implementation of "Generative adversarial network-based postfilter for statistical parametric speech synthesis"
☆16 · Jun 27, 2018 · Updated 8 years ago
ictnlp / LSG
The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”
☆15 · Jan 3, 2025 · Updated last year
weitsung50110 / Huggingface_Langchain_kit
☆15 · Aug 4, 2025 · Updated 11 months ago
srang992 / Ollama-Chatbot
☆18 · Oct 20, 2023 · Updated 2 years ago
ADT109119 / ChatPDF-LineBot
Uses LangChain + FastAPI + Vue to build a LineBot that can upload and read PDFs to answer questions.
☆17 · May 19, 2026 · Updated 2 months ago
caipeng328 / ForCenNet
☆81 · Jul 31, 2025 · Updated 11 months ago
Yuliang-Liu / MultimodalOCR
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
☆870 · Updated this week
ygfrancois / crnn.pytorch.tensorrt.chinese
A Chinese character recognition repository with TensorRT format support, based on CRNN_Chinese_Characters_Rec and TensorRTx.
☆18 · Mar 11, 2021 · Updated 5 years ago
jakjus / gradiologin
OAuth Login for Gradio. Supports multiple identity providers.
☆16 · Jul 6, 2026 · Updated 2 weeks ago
27182812 / ChineseBERT_paddle
Paddle reproduction of the paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information" (ACL 2021)
☆10 · Nov 15, 2021 · Updated 4 years ago