mgedmin/pdf2html

Wrapper for pdftohtml that tries to extract paragraph structure

☆52

Alternatives and similar repositories for pdf2html

Users who are interested in pdf2html are comparing it to the libraries listed below.

Garee / jchess
A simple chess engine
☆11 · Dec 16, 2018 · Updated 7 years ago
Philip-Bachman / NN-Python
The useful and used parts of NN-Dropout
☆25 · Jun 4, 2015 · Updated 11 years ago
adiyoss / StructED
Risk Minimization Algorithms in Structured Prediction (JMLR 2016)
☆13 · Jan 26, 2017 · Updated 9 years ago
City-of-Helsinki / django-munigeo
Reusable Django application for storing and accessing municipality-related geospatial data
☆14 · Jun 5, 2026 · Updated last month
DongjunLee / DeepLearning-Notebooks
Deep Learning Notebooks Implements by TensorFlow, Python + numpy
☆12 · May 3, 2017 · Updated 9 years ago
Esukhia / sympound-python
Python version of the SymSpell Compound algorithm
☆12 · Sep 18, 2018 · Updated 7 years ago
marisademeglio / media-overlays-js
EPUB Media Overlays javascript implementation
☆14 · Aug 19, 2016 · Updated 9 years ago
ddevaraj / docker-brat
Dockerization of brat application
☆13 · Jun 13, 2018 · Updated 8 years ago
bwagner5 / Dynamic-IP-Route53
Updates a Route53 Zone with your computer's public IP
☆12 · May 21, 2024 · Updated 2 years ago
julianthome / autorex
A dk.brics FSM to regular-expression-string converter
☆10 · Jul 12, 2025 · Updated last year
uci-cbcl / GBMCI
The implementation of gradient boosting machine for concordance index learning.
☆16 · Oct 8, 2013 · Updated 12 years ago
AKSW / Mandolin
❇️ The best modules for Markov Logic Networks condensed in one framework.
☆13 · Dec 20, 2017 · Updated 8 years ago
CoEDL / elan-helpers
Tools and scripts for working with ELAN
☆10 · Aug 4, 2022 · Updated 3 years ago
eliask / pdfssa4met
PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging - https://code.google.com/p/pdfssa4met/
☆19 · Mar 6, 2013 · Updated 13 years ago
dhdaines / paves
Beneath the paving stones, the PLAYA 🏖️
☆17 · Jul 3, 2026 · Updated 2 weeks ago
watsonbox / sphinxtrain-ruby
Toolkit for training/adapting CMU Sphinx acoustic models
☆17 · May 25, 2018 · Updated 8 years ago
KMCS-NII / PDFNLT-1.0
Tools for Natural Language Text aware PDF structure analysis
☆15 · Mar 11, 2022 · Updated 4 years ago
tmthrgd / shoco
shoco is a compressor for small text strings. [Not maintained].
☆11 · Sep 4, 2019 · Updated 6 years ago
crujzo / Para-Phrase
Please visit this repo for enhanced and updated open source code
☆14 · Dec 14, 2025 · Updated 7 months ago
rollecode / personal-assistant-cli
Prioritize your Todoist tasks via OpenAI and save them to Obsidian.
☆19 · Jan 12, 2025 · Updated last year
cnigfr / structuration-reglement-urbanisme
Repository for the working files of SG6 of the GT DDU
☆15 · Jun 30, 2026 · Updated 3 weeks ago
muxuezi / pdf2md3
pdf to markdown with Python3
☆10 · Oct 30, 2019 · Updated 6 years ago
ryandw11 / Octree
An octree library for Java.
☆10 · Aug 21, 2020 · Updated 5 years ago
AustinDizzy / davine
A social analytics service built for Twitter's Vine.
☆13 · Oct 16, 2015 · Updated 10 years ago
AlasdairF / Tokenize
All-in-one text tokenizer for Go. Super-fast. Lots of features.
☆13 · Dec 18, 2015 · Updated 10 years ago
stickeritis / sticker2
Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot
☆13 · Dec 18, 2020 · Updated 5 years ago
arne-cl / brat-embedded-visualization-examples
minimal examples of brat annotation visualizations
☆17 · Jan 21, 2015 · Updated 11 years ago
howl-anderson / tf_crf_layer
CRF(Conditional Random Field) Layer for TensorFlow 1.X with many powerful functions
☆15 · Jan 3, 2020 · Updated 6 years ago
dangjaya / drugAI
☆11 · Jun 16, 2024 · Updated 2 years ago
FrancisGregoire / parSentExtract
A BiRNN framework implemented in Python and TensorFlow to extract parallel sentences from aligned comparable corpora.
☆33 · Sep 4, 2018 · Updated 7 years ago
cltl / svm_wsd
Word Sense Disambiguation system developed on the DutchSemCor project using Support Vector Machines. The input is plain text, and the out…
☆12 · Feb 5, 2019 · Updated 7 years ago
ucam-smt / sgnmt
Decoding platform for machine translation research
☆54 · Aug 24, 2019 · Updated 6 years ago
casetext / r-and-r
Code for the "Long Context Needs Some R&R" paper.
☆12 · Mar 11, 2024 · Updated 2 years ago
robertdebock / docker-alpine-openrc
Container to test Ansible roles in, including capabilities to use openrc facilities
☆11 · Sep 24, 2025 · Updated 10 months ago
ai-wand / concise-reasoning
Concise Reasoning via Reinforcement Learning
☆13 · Apr 16, 2025 · Updated last year
standardhealth / standardhealth.github.io
Standard Health Record Collaborative
☆22 · Aug 2, 2024 · Updated last year
explosion / vscode-prodigy
🧬 A VS Code extension for annotating data with Prodigy
☆30 · Nov 25, 2021 · Updated 4 years ago
brendano / mte
MiTextExplorer - interactive browser of text and document covariates.
☆24 · Jun 17, 2015 · Updated 11 years ago
baulbo / Diard
From document (PDF) or document images to analysis ready semi-structured data.
☆20 · Nov 4, 2022 · Updated 3 years ago