PDF Extraction Toolkit
☆42 Nov 23, 2020 Updated 5 years ago
Alternatives and similar repositories for pdfxtk
Users who are interested in pdfxtk are comparing it to the libraries listed below.
- Java command-line tools for comparing results to ground truth for table location and structure detection as used in the ICDAR 2013 Table … ☆33 May 31, 2020 Updated 5 years ago
- This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified an… ☆23 Sep 11, 2020 Updated 5 years ago
- Structured Data from PDF image-based files ☆91 Mar 1, 2013 Updated 13 years ago
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆69 Nov 7, 2020 Updated 5 years ago
- jpdfbookmarks - fixes JPdfBookmarks GUI mode, where opening a PDF whose bookmarks include CJK (Chinese, Japanese, Korean) characters will show like… ☆11 Sep 4, 2023 Updated 2 years ago
- Python-based Scraping and parsing toolkit ☆12 Apr 1, 2023 Updated 2 years ago
- Computer Vision Segmentation for Document Layout Analysis ☆10 Sep 26, 2022 Updated 3 years ago
- Command-Line SQLite3 Wrapper for text manipulation with SQL ☆16 Feb 9, 2017 Updated 9 years ago
- Stream based PDF library ☆15 Aug 20, 2015 Updated 10 years ago
- The repository of Icecite, a research paper management system. ☆15 Mar 29, 2018 Updated 8 years ago
- Unofficial Node.js module for the Europeana API. Search and lookup art in various archives across Europe. ☆10 Nov 30, 2025 Updated 3 months ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Jan 16, 2022 Updated 4 years ago
- Library for parsing, transforming and producing PDF files ☆25 May 5, 2011 Updated 14 years ago
- Liberate all kinds of data from PDF and other unstructured formats and make the information machine-readable and visualizable for popul… ☆31 Jun 1, 2018 Updated 7 years ago
- Adds a guard to disable ObjectInputStream.readObject ☆11 Dec 6, 2015 Updated 10 years ago
- Overridable universal operator overloading for C++14 ☆20 Nov 12, 2014 Updated 11 years ago
- A Purescript table renderer capable of displaying multidimensional, heterogeneous JSON data ☆14 Jan 24, 2018 Updated 8 years ago
- Command-line tool for tracking your day-to-day software development work ☆18 May 8, 2022 Updated 3 years ago
- Computer Vision and Deep Learning tutorials for the course Foundation of Digital Humanities ☆10 Dec 6, 2019 Updated 6 years ago
- Web application for easy and convenient viewing of OCR results. ☆15 Apr 13, 2021 Updated 4 years ago
- Python library for working with JSON HAL ☆38 Aug 29, 2013 Updated 12 years ago
- Save your bookmark collection in the Internet Archive, or locally. ☆23 Jul 5, 2022 Updated 3 years ago
- SHERG rule extraction and parsing tools ☆24 Oct 9, 2015 Updated 10 years ago
- Repository for the Global Urban Network Dashboard. Contains dash app, assets (style.css, images), and GUN data. ☆16 Oct 25, 2023 Updated 2 years ago
- UniType is a truetype font library for golang ☆17 Jan 9, 2025 Updated last year
- A Python-based interface to the Saleae Logic/Logic16 Device SDK ☆19 Jul 11, 2014 Updated 11 years ago
- Ranking Entity Types using the Web of Data ☆30 Nov 22, 2016 Updated 9 years ago
- High performance OSD prototype ☆12 Oct 8, 2017 Updated 8 years ago
- Unsupervised multilingual sentence segmentation. ☆21 Feb 26, 2021 Updated 5 years ago
- Buildings Unified Data point naming schema for Operation management ☆15 Mar 31, 2023 Updated 2 years ago
- Federated Microblogging Application ☆13 Sep 7, 2024 Updated last year
- DL models that take a document image file as input, locate the position of paragraphs, lines, images, etc. with their labels and confiden… ☆26 Dec 31, 2020 Updated 5 years ago
- A set of Jupyter notebooks capturing an effort to apply Keras to the problem of automatic knowledge base construction. ☆11 Aug 30, 2016 Updated 9 years ago
- lxml parser for gbXML files - create and edit gbXML files with custom Python Element classes ☆14 Oct 4, 2023 Updated 2 years ago
- A collection of things I found useful for doing Machine Learning problem sets. ☆25 Dec 20, 2018 Updated 7 years ago
- High-level Rust library that binds to Poppler to extract text from a PDF ☆11 Dec 16, 2020 Updated 5 years ago
- Gamera 3 for Python 2 (deprecated) ☆39 Aug 15, 2022 Updated 3 years ago
- Evaluation Tool for the ICDAR 2019 Competition on Table Detection and Recognition ☆42 May 8, 2022 Updated 3 years ago
- Makefiles for using EFM8 microcontrollers with SDCC ☆11 Apr 30, 2018 Updated 7 years ago