pdf2xml convertor based on Xpdf library - modified version
☆27 · Feb 23, 2018 · Updated 8 years ago
Alternatives and similar repositories for pdf2xml
Users that are interested in pdf2xml are comparing it to the libraries listed below.
- PDF to XML ALTO file converter ☆269 · May 30, 2026 · Updated last week
- ☆18 · Apr 6, 2021 · Updated 5 years ago
- Project based at the Bond University Center for Research in Evidence-Based Practice (CREBP) with the aim of drastically reducing the time… ☆15 · Aug 28, 2017 · Updated 8 years ago
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources ☆18 · May 14, 2023 · Updated 3 years ago
- OpenQuant Tongshi (通视) full-push stock market data interface; no longer maintained, please move to the XAPI2 project ☆13 · Sep 12, 2013 · Updated 12 years ago
- Some examples of usage of Grobid in a third party java project. ☆20 · Jun 14, 2023 · Updated 2 years ago
- A machine learning software for extracting astronomical entities from scholarly documents ☆10 · Oct 31, 2022 · Updated 3 years ago
- Code for pre-training CharacterBERT models (as well as BERT models). ☆34 · Sep 6, 2021 · Updated 4 years ago
- Pre-processing text and tokenization for UTH-BERT ☆10 · Sep 30, 2020 · Updated 5 years ago
- ☆11 · Apr 15, 2022 · Updated 4 years ago
- copy of pdftohtml code with enhancements ☆25 · Nov 18, 2023 · Updated 2 years ago
- Collection of LaTeX utility packages for scientific documents ☆17 · Sep 13, 2023 · Updated 2 years ago
- Terminal tool that converts files encoding to UTF-8 ☆10 · Oct 5, 2019 · Updated 6 years ago
- ☆21 · May 1, 2025 · Updated last year
- ☆15 · Dec 8, 2022 · Updated 3 years ago
- A machine learning software for extracting information from scholarly documents ☆23 · Jan 12, 2021 · Updated 5 years ago
- Line shuffler for huge text file which does not fit in memory ☆13 · Dec 1, 2022 · Updated 3 years ago
- DatasetImgLabeler is a image annotation tool for researchers to prepare datasets in ICDAR2015 format ☆12 · Dec 7, 2019 · Updated 6 years ago
- Automagically ignore all notifications related to work when you are on vacations ☆21 · Aug 21, 2020 · Updated 5 years ago
- Web-based page layout editor created for EMOP (Early Modern OCR Project). ☆11 · May 21, 2021 · Updated 5 years ago
- GROBID extension for identifying and normalizing physical quantities. ☆85 · Apr 8, 2026 · Updated 2 months ago
- Tutorial on running keras model in C++ and python tensorflow ☆11 · Oct 30, 2018 · Updated 7 years ago
- character recognition, textline recognition ☆10 · Aug 31, 2019 · Updated 6 years ago
- A Lucene Indexer for XML, with lexical analysis (lemmatization for French) ☆18 · Updated this week
- Poor man's simple harvester for arXiv resources ☆14 · Jul 14, 2023 · Updated 2 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · May 2, 2021 · Updated 5 years ago
- ☆10 · Mar 16, 2023 · Updated 3 years ago
- ☆10 · Aug 5, 2019 · Updated 6 years ago
- Async procedures for Clojure ☆13 · Oct 5, 2022 · Updated 3 years ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF ☆10 · Apr 27, 2020 · Updated 6 years ago
- Spark MLib Training Models for Network Security ☆16 · Mar 19, 2018 · Updated 8 years ago
- ☆10 · Apr 21, 2020 · Updated 6 years ago
- All released PHP distributions ☆35 · Jun 3, 2026 · Updated last week
- A browser extension providing Open Access bibliographical services ☆18 · Dec 9, 2022 · Updated 3 years ago
- Convert annotation file in Pascal VOC format (.xml or .json) to COCO format. Partition the dataset and annotations into training and vali… ☆10 · Apr 2, 2020 · Updated 6 years ago
- An Implementation of ERNIE For Language Understanding (including Pre-training models and Fine-tuning tools) ☆27 · Jul 30, 2019 · Updated 6 years ago
- Extension for pie to include taggers with their models and pre/postprocessors ☆11 · May 30, 2024 · Updated 2 years ago
- Core libraries by the PRImA Research Lab ☆16 · Jul 30, 2024 · Updated last year
- ☆10 · May 24, 2019 · Updated 7 years ago