invoice-x/invoice2data

Extract structured data from PDF invoices

☆2,178

Alternatives and similar repositories for invoice2data

Users interested in invoice2data are comparing it to the libraries listed below.

naiveHobo / InvoiceNet
Deep neural network to extract intelligent information from invoice documents.
☆2,690 · May 3, 2024 · Updated 2 years ago
robela / OCR-Invoice
a console application that would run on Windows server to scan user’s Bill and Receipts, which are either captured by camera or in form o…
☆150 · Aug 15, 2016 · Updated 9 years ago
invoice-x / invoicex-gui
Graphical User Interface for factur-x library with basic functionalities
☆24 · Feb 27, 2019 · Updated 7 years ago
swsq1134 / INVOICE-PARSER
For extracting information from invoices and purchase orders
☆21 · Aug 19, 2020 · Updated 5 years ago
ReceiptManager / receipt-parser-legacy
A supermarket receipt parser written in Python using tesseract OCR
☆853 · Aug 28, 2024 · Updated last year
dhavalpotdar / Graph-Convolution-on-Structured-Documents
This repo contains code to convert Structured Documents to Graphs and implement a Graph Convolution Neural Network for node classificatio…
☆145 · Dec 8, 2022 · Updated 3 years ago
vsymbol / CUTIE
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)
☆152 · Dec 8, 2022 · Updated 3 years ago
camelot-dev / camelot
A Python library to extract tabular data from PDFs
☆3,786 · Updated this week
atlanhq / camelot
Camelot: PDF Table Extraction for Humans
☆3,716 · Jan 5, 2023 · Updated 3 years ago
jsvine / pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
☆10,575 · Updated this week
DevashishPrasad / CascadeTabNet
This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table …
☆1,549 · Aug 27, 2021 · Updated 4 years ago
mindee / doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning. Ongo…
☆6,190 · Updated this week
WZBSocialScienceCenter / pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
☆2,255 · Jun 24, 2022 · Updated 4 years ago
kainotomo / invoice2erpnext
Extract data from invoices and import them into your ERPNext site. This app can parse PDF and create purchase orders and invoices in ErpN…
☆35 · Jun 5, 2026 · Updated last month
camelot-dev / excalibur
A web interface to extract tabular data from PDFs
☆1,810 · May 20, 2026 · Updated 2 months ago
tabulapdf / tabula
Tabula is a tool for liberating data tables trapped inside PDF files
☆7,446 · Mar 14, 2025 · Updated last year
impira / docquery
An easy way to extract information from documents
☆1,774 · May 3, 2023 · Updated 3 years ago
the-paperless-project / paperless
Scan, index, and archive all of your paper documents
☆7,917 · Apr 6, 2021 · Updated 5 years ago
kba / awesome-ocr
Links to awesome OCR projects
☆3,111 · Jul 6, 2024 · Updated 2 years ago
JaidedAI / EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and …
☆29,800 · Dec 5, 2025 · Updated 7 months ago
zzzDavid / ICDAR-2019-SROIE
ICDAR 2019 Robust Reading Challenge on Scanned Receipts OCR and Information Extraction
☆417 · Jul 20, 2020 · Updated 6 years ago
m3nu / invoice2data
Extract structured data from PDF invoices
☆14 · Mar 16, 2021 · Updated 5 years ago
NanoNets / invoice-processing-with-python-nanonets
Invoice Processing with Python and Nanonets
☆31 · Nov 27, 2020 · Updated 5 years ago
jcushman / pdfquery
A fast and friendly PDF scraping library.
☆781 · Oct 17, 2023 · Updated 2 years ago
pdfminer / pdfminer.six
Community maintained fork of pdfminer - we fathom PDF
☆7,002 · Mar 13, 2026 · Updated 4 months ago
piyushmathur17 / invoice-extractor
A python implementation to extract data in structured form from an image of an invoice
☆30 · Sep 7, 2020 · Updated 5 years ago
thisisbhavin / graphicalForest
Using the adjacency matrix and random forest get the Name, Address, Items, Prices, Grand total from all kind of invoices.
☆18 · Mar 8, 2020 · Updated 6 years ago
ocrmypdf / OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
☆34,240 · Updated this week
ExtractTable / ExtractTable-py
Python library to extract tabular data from images and scanned PDFs
☆285 · Jul 30, 2024 · Updated last year
zacharywhitley / awesome-ocr
☆1,011 · Jun 29, 2026 · Updated 3 weeks ago
herobd / Visual-Template-Free-Form-Parsing
Code for my ICDAR paper "Deep Visual Template-Free Form Parsing"
☆89 · Jan 14, 2022 · Updated 4 years ago
Layout-Parser / layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
☆5,764 · Aug 15, 2024 · Updated last year
invoice-x / factur-x-ng
Python lib for Factur-X, the e-invoicing standard for France and Germany
☆40 · Oct 10, 2018 · Updated 7 years ago
cseas / ocr-table
Extract tables from scanned image PDFs using Optical Character Recognition.
☆278 · Jun 9, 2020 · Updated 6 years ago
ocropus-archive / DUP-ocropy
Python-based tools for document analysis and OCR
☆3,466 · May 22, 2021 · Updated 5 years ago
bikash / DocumentUnderstanding
Research papers and code on information extraction from image/pdf
☆97 · Nov 25, 2022 · Updated 3 years ago
deepdoctection / deepdoctection
A Repo For Document AI
☆3,192 · Jun 20, 2026 · Updated last month
clovaai / donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
☆6,906 · Jul 11, 2024 · Updated 2 years ago
sciencefictionlab / chargrid-pytorch
Pytorch Implementation of Chargrid Paper (https://arxiv.org/abs/1809.08799)
☆27 · Mar 11, 2022 · Updated 4 years ago