marieai / marie-ai
Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications, supporting various tasks such as document cleanup, optical character recognition (OCR), classification, splitting, named entity recognition, and form processi…
☆70Updated this week
Alternatives and similar repositories for marie-ai
Users who are interested in marie-ai are comparing it to the libraries listed below.
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod…☆20Updated 2 years ago
- ☆22Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection☆105Updated 9 months ago
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi…☆37Updated 3 months ago
- Table detection (TD) and table structure recognition (TSR) using Yolov5/Yolov8, and you can get the same (even better) result compared wi…☆46Updated 11 months ago
- TableNet: Deep Learning model for end-to-end Table Detection and Tabular data extraction from Scanned Data Images. In modern times, more a…☆58Updated 3 years ago
- Repository for deepdoctection tutorial notebooks☆45Updated this week
- Tools for extracting figures, tables, text, .. from a PDF document.☆32Updated 4 years ago
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents☆36Updated last year
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.☆123Updated 2 years ago
- ☆40Updated 4 years ago
- ☆80Updated 3 years ago
- Code for ICPR2022 paper: "Graph Neural Networks and Representation Embedding for table extraction in PDF Documents"☆35Updated last year
- Run tesseract with the tesserocr bindings with @OCR-D's interfaces☆39Updated last month
- Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents☆47Updated 3 years ago
- Logical structure analysis for visually structured documents☆90Updated 2 years ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions☆44Updated last year
- DFKI Layout Detection for OCR-D☆47Updated last month
- Reading order, Layoutreader☆15Updated last month
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO…☆50Updated 3 months ago
- transformer based OCR framework used to train OCR or image to latex☆9Updated 2 years ago
- Data extraction with Donut ML model☆57Updated 10 months ago
- An official implementation of paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"☆78Updated last year
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment.☆52Updated 8 months ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"☆73Updated 2 months ago
- A line-based framework to detect and extract tabular data in JSON format from raster images using computer vision and Tesseract OCR.☆57Updated last year
- Dense Article Dataset (DAD): A Benchmark Dataset for Document Layout Analysis☆16Updated 3 years ago
- Object Detection Model for Scanned Documents☆93Updated 3 months ago
- YOLO models trained by DocLayNet - power your Document Intelligent by Layout Analysis☆115Updated 3 months ago
- 📃 A contracts clause summarization system using LLM and vector database☆17Updated 4 months ago