PDF to XML ALTO file converter
☆269 · May 10, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for pdfalto
Users that are interested in pdfalto are comparing it to the libraries listed below.
- pdf2xml converter based on Xpdf library - modified version ☆27 · Feb 23, 2018 · Updated 8 years ago
- A browser extension providing Open Access bibliographical services ☆18 · Dec 9, 2022 · Updated 3 years ago
- A high performance bibliographic information service: https://biblio-glutton.readthedocs.io ☆149 · Apr 8, 2026 · Updated last month
- Convert between Tesseract hOCR and ALTO XML using XSL stylesheets ☆60 · Mar 20, 2026 · Updated 2 months ago
- Open Access PDF harvester ☆42 · May 3, 2024 · Updated 2 years ago
- Softcite software mention recognizer, finding mentions and citations to software from within the academic literature ☆84 · Apr 16, 2026 · Updated last month
- Service for converting and enhancing heterogeneous publisher XML formats into TEI ☆65 · Apr 12, 2026 · Updated last month
- A machine learning software for extracting information from scholarly documents ☆4,890 · May 14, 2026 · Updated 2 weeks ago
- Some examples of usage of Grobid in a third party java project. ☆20 · Jun 14, 2023 · Updated 2 years ago
- Conversions between various OCR formats ☆84 · Feb 13, 2026 · Updated 3 months ago
- A machine learning tool for fishing entities ☆268 · Feb 27, 2026 · Updated 3 months ago
- Python client for GROBID Web services ☆406 · Mar 5, 2026 · Updated 2 months ago
- GROBID extension for identifying and normalizing physical quantities. ☆84 · Apr 8, 2026 · Updated last month
- a Deep Learning Framework for Text https://delft.readthedocs.io/ ☆415 · May 11, 2026 · Updated 2 weeks ago
- Finding mentions and citations to named and implicit research datasets from within the academic literature ☆30 · Jun 14, 2025 · Updated 11 months ago
- ☆33 · Nov 16, 2022 · Updated 3 years ago
- A Named-Entity Recogniser based on Grobid. ☆55 · May 14, 2025 · Updated last year
- ALTO XML schema - latest and all former versions ☆55 · Jan 20, 2026 · Updated 4 months ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF ☆10 · Apr 27, 2020 · Updated 6 years ago
- Poor man's simple harvester for arXiv resources ☆14 · Jul 14, 2023 · Updated 2 years ago
- ISTEX button: a web extension that dynamically inserts, into the web page being viewed, a link to the full text of a document if the latt… ☆11 · May 30, 2023 · Updated 2 years ago
- wrapper for the crossref events api ☆24 · May 23, 2023 · Updated 3 years ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆62 · May 3, 2024 · Updated 2 years ago
- Python tools for performing various operations on ALTO XML files ☆49 · Feb 27, 2025 · Updated last year
- Document Layout Analysis resources repos for development with PdfPig. ☆635 · Oct 1, 2023 · Updated 2 years ago
- Science-parse version 2 ☆257 · Nov 20, 2019 · Updated 6 years ago
- Knowledge Base stuff ☆23 · Mar 1, 2026 · Updated 2 months ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆202 · May 21, 2025 · Updated last year
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources ☆17 · May 14, 2023 · Updated 3 years ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF ☆39 · Mar 8, 2022 · Updated 4 years ago
- Material parsers and other tools, scripts Initially developed for Grobid Superconductor ☆13 · Feb 21, 2025 · Updated last year
- XSLT for converting TEI MsDescription to IIIF manifests ☆13 · Oct 18, 2016 · Updated 9 years ago
- Scientific articles using or citing Common Crawl data ☆29 · Mar 19, 2026 · Updated 2 months ago
- ☆13 · Sep 4, 2015 · Updated 10 years ago
- Content ExtRactor and MINEr ☆512 · Jun 30, 2022 · Updated 3 years ago
- my take at a PDF text extraction utility ☆15 · Jun 15, 2015 · Updated 10 years ago
- Grobid module for superconductor material and properties extraction ☆22 · May 17, 2025 · Updated last year
- A web application for organizing the research notes of humanities scholars ☆26 · Oct 4, 2016 · Updated 9 years ago
- Java based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and HOCR. ☆36 · May 25, 2023 · Updated 3 years ago