Uses OCR recognition results to constrain the answers of multimodal large language models in visual information extraction tasks
☆44Dec 31, 2024Updated last year
Alternatives and similar repositories for guidance-ocr
Users that are interested in guidance-ocr are comparing it to the libraries listed below
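- A multimodal language model for Chinese document understanding, supporting multimodal document information extraction and document embedding☆12Jun 26, 2022Updated 3 years ago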
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)☆16Apr 23, 2024Updated last year
- ☆19Dec 6, 2023Updated 2 years ago
- ☆21Feb 26, 2024Updated 2 years ago
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer…☆27Mar 29, 2024Updated last year
- ☆34Dec 18, 2025Updated 2 months ago
- An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis https://arxiv.org/abs/2503.08…☆39Jun 4, 2025Updated 9 months ago
- Chinese aspect-level sentiment recognition using BERT.☆25Jun 26, 2023Updated 2 years ago
- Using machine learning to detect fake digital images☆27Mar 4, 2018Updated 8 years ago
- ☆27Jul 18, 2023Updated 2 years ago
- [SIGGRAPH Asia 2025] The official implementation of the paper "DvD: Unleashing a Generative Paradigm for Document Dewarping via Coordinat…☆33Nov 22, 2025Updated 3 months ago
- DocBank document image enhancement dataset. This dataset is used for document image enhancement; the tasks include: Seal detection & Removal; Watermark detection & Removal; Document deblurrin…☆44Oct 22, 2024Updated last year
- An NVIDIA Triton Server workflow for OCR and the LayoutLMv3 Transformer Model☆30Sep 14, 2022Updated 3 years ago
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…☆34Mar 26, 2024Updated last year
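- Fast pdf translate is a PDF translation tool: it uses MinerU to convert the PDF to Markdown, splits the Markdown into chunks, sends them to a large language model for translation, then assembles the translated results and generates the output PDF with pypandoc.☆41Mar 23, 2025Updated 11 months ago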
- ☆55Updated this week
- Backtesting fbprophet prediction of Silver prices for 2017☆14Nov 29, 2017Updated 8 years ago
- Chinese-language versions of APIs used in machine learning, together with machine learning theory☆13Jun 8, 2025Updated 9 months ago
- A chart parser based on multimodal large models☆43Mar 28, 2025Updated 11 months ago
- [ACM TOMM] Official implementation of "TextCoT: Zoom-In for Enhanced Multimodal Text-Rich Image Understanding"☆44Feb 27, 2026Updated last week
- Moiré Pattern Removal for Mobile, Texts/Diagrams on Single-colored Background☆11Feb 7, 2022Updated 4 years ago
- ☆18Feb 16, 2025Updated last year
- ☆11Oct 31, 2024Updated last year
- Author: qq820629211, 1656724967☆11Jan 20, 2020Updated 6 years ago
- A football video analysis task that uses YOLO to detect people and balls in the video☆12Sep 5, 2020Updated 5 years ago
- Deploy a machine learning model with TensorFlow Serving and Docker☆10Dec 5, 2018Updated 7 years ago
- Implementation of the TFHE homomorphic encryption scheme.☆12May 14, 2021Updated 4 years ago
- ☆23Dec 11, 2025Updated 2 months ago
- A Paddle reproduction of the paper ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- Chinese whole-word masking (WWM) and n-gram masking based on jieba☆11Jul 25, 2019Updated 6 years ago
- Parses a document (scanned or phone captured) and returns the underlying question - answer layout structured capture by LayoutXLM model☆10Jun 14, 2021Updated 4 years ago
- For public-welfare use by the xyb community☆15Jun 3, 2025Updated 9 months ago
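- Dive into Deep Learning (《动手学深度学习》): aimed at Chinese readers, runnable, and open to discussion. The English edition is the textbook for Berkeley's "Introduction to Deep Learning (STAT 157)" course.☆10Jul 27, 2019Updated 6 years ago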
- This repository contains the dataset 'DEPTWEET' published in the journal of Computers in Human Behavior.☆12Jul 12, 2023Updated 2 years ago
- Flask Web Interface to deploy ManTraNet and BusterNet for testing image manipulations☆10Jan 24, 2020Updated 6 years ago
- Automatic WeChat message sending and WeChat bulk messaging, for the WeChat client on Windows (PC version)☆11Dec 14, 2024Updated last year
- Object tracking using pyqt5 and opencv3☆10Feb 23, 2018Updated 8 years ago
- Long Context Research☆29Jan 26, 2026Updated last month