PDF to XML ALTO file converter
☆268Feb 11, 2026Updated last month
Alternatives and similar repositories for pdfalto
Users that are interested in pdfalto are comparing it to the libraries listed below.
- pdf2xml converter based on Xpdf library - modified version☆27Feb 23, 2018Updated 8 years ago
- A browser extension providing Open Access bibliographical services☆18Dec 9, 2022Updated 3 years ago
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io☆148Mar 6, 2026Updated 3 weeks ago
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets☆59Mar 20, 2026Updated last week
- Open Access PDF harvester☆42May 3, 2024Updated last year
- Softcite software mention recognizer, finding mentions and citations to software from within the academic literature☆82Sep 30, 2025Updated 5 months ago
- Service for converting and enhancing heterogeneous publisher XML formats into TEI☆62Sep 14, 2024Updated last year
- A machine learning software for extracting information from scholarly documents☆4,743Updated this week
- Some examples of usage of Grobid in a third party java project.☆20Jun 14, 2023Updated 2 years ago
- Conversions between various OCR formats☆84Feb 13, 2026Updated last month
- A machine learning tool for fishing entities☆269Feb 27, 2026Updated last month
- Python client for GROBID Web services☆394Mar 5, 2026Updated 3 weeks ago
- GROBID extension for identifying and normalizing physical quantities.☆83Jun 15, 2025Updated 9 months ago
- a Deep Learning Framework for Text https://delft.readthedocs.io/☆414Updated this week
- Finding mentions and citations to named and implicit research datasets from within the academic literature☆30Jun 14, 2025Updated 9 months ago
- ☆33Nov 16, 2022Updated 3 years ago
- A Named-Entity Recogniser based on Grobid.☆54May 14, 2025Updated 10 months ago
- ALTO XML schema - latest and all former versions☆55Jan 20, 2026Updated 2 months ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF☆10Apr 27, 2020Updated 5 years ago
- Poor man's simple harvester for arXiv resources☆13Jul 14, 2023Updated 2 years ago
- ISTEX button: a web extension that dynamically inserts, on the web page being viewed, a link to the full text of a document if the latt…☆11May 30, 2023Updated 2 years ago
- wrapper for the crossref events api☆23May 23, 2023Updated 2 years ago
- Python tools for performing various operations on ALTO XML files☆49Feb 27, 2025Updated last year
- Open Access PDF harvester, metadata aggregator and full-text ingester☆62May 3, 2024Updated last year
- Document Layout Analysis resources repos for development with PdfPig.☆633Oct 1, 2023Updated 2 years ago
- Science-parse version 2☆255Nov 20, 2019Updated 6 years ago
- Knowledge Base stuff☆23Mar 1, 2026Updated 3 weeks ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)☆201May 21, 2025Updated 10 months ago
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources☆17May 14, 2023Updated 2 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆39Mar 8, 2022Updated 4 years ago
- Material parsers and other tools and scripts, initially developed for Grobid Superconductor☆13Feb 21, 2025Updated last year
- XSLT for converting TEI MsDescription to IIIF manifests☆13Oct 18, 2016Updated 9 years ago
- Scientific articles using or citing Common Crawl data☆28Mar 19, 2026Updated last week
- ☆13Sep 4, 2015Updated 10 years ago
- Content ExtRactor and MINEr☆512Jun 30, 2022Updated 3 years ago
- my take at a PDF text extraction utility☆15Jun 15, 2015Updated 10 years ago
- Grobid module for superconductor material and properties extraction☆22May 17, 2025Updated 10 months ago
- A web application for organizing the research notes of humanities scholars☆26Oct 4, 2016Updated 9 years ago
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR.☆35May 25, 2023Updated 2 years ago