MosRat / got.cpp
Using llama.cpp and onnxruntime to accelerate inference of GOT-OCR2.0
☆14 Updated 3 weeks ago
Alternatives and similar repositories for got.cpp:
Users who are interested in got.cpp are comparing it to the repositories listed below.
- Research on accelerating GOT-OCR for real-world deployment, in any language ☆59 Updated 5 months ago
- For learning GOT/Qwen/OnnxLLm ☆49 Updated 5 months ago
- Vary-tiny codebase built upon LAVIS (for training from scratch) and PDF image-text pair data (about 600k pairs, including English and Chinese) ☆79 Updated 6 months ago
- A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease. ☆171 Updated last month
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding" ☆141 Updated 10 months ago
- Datasets and Evaluation Scripts for CompHRDoc ☆35 Updated last month
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆201 Updated 10 months ago
- ☆56 Updated last year
- ☆82 Updated 3 months ago
- Reading order, Layoutreader ☆11 Updated 10 months ago
- PDF data from Chinese papers, securities documents, and financial reports ☆25 Updated 9 months ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model" ☆73 Updated 2 months ago
- Table Structure Recognition ☆17 Updated 8 months ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆42 Updated last year
- ☆26 Updated 5 months ago
- ☆25 Updated last month
- YOLO models trained by DocLayNet - power your Document Intelligent by Layout Analysis ☆96 Updated 3 weeks ago
- This repo is used to release the ArxivFormula dataset. ☆24 Updated 4 months ago
- Accelerating GOT-OCRv2 with VLLM ☆9 Updated 4 months ago
- qwen2 and llama3 cpp implementation ☆43 Updated 9 months ago
- A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition ☆149 Updated last week
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models ☆146 Updated 6 months ago
- [MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications. ☆28 Updated 3 months ago
- [TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild ☆35 Updated last year
- Table Structure Recognition ☆69 Updated 2 years ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence ☆142 Updated 6 months ago
- Our 2nd-gen LMM ☆33 Updated 10 months ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM. ☆37 Updated 6 months ago
- Training a small but capable formula recognition model based on TrOCR and the UniMER-1M dataset ☆21 Updated 4 months ago
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios. ☆21 Updated 3 months ago