dfop02 / html4docx
Convert html to docx
☆45 · Updated last week
Alternatives and similar repositories for html4docx
Users who are interested in html4docx are comparing it to the libraries listed below.
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆192 · Updated last week
- Convert html to docx ☆83 · Updated last year
- A python library to make filling pdfs much easier ☆153 · Updated last year
- Python bindings to PDFium, reasonably cross-platform. ☆647 · Updated last week
- Benchmarking PDF libraries ☆312 · Updated 3 months ago
- Docx tracked change redlines for the Python ecosystem. ☆83 · Updated last year
- A fast, comprehensive, ISO 639 library. ☆43 · Updated last month
- Streamlit PDF viewer ☆177 · Updated 2 weeks ago
- Python API for PDF documents ☆124 · Updated last year
- Demos, examples and utilities using PyMuPDF ☆683 · Updated last year
- A Python asyncio wrapper for Tesseract-OCR. ☆26 · Updated last week
- A curated list of resources around PDF files ☆141 · Updated last year
- Simplify DOCX files to JSON ☆251 · Updated last year
- A simple library for segmenting legal texts ☆17 · Updated 2 years ago
- Python wrapper for epubcheck ☆21 · Updated last year
- A Python tool to help extracting information from structured PDFs. ☆415 · Updated last week
- A fast and scalable app that adds vector search capabilities to your Django applications. It offers low latency, fast search results, nat… ☆79 · Updated 2 weeks ago
- NLP Web API for Legal Text ☆18 · Updated 2 years ago
- Repository for deepdoctection tutorial notebooks ☆46 · Updated 3 months ago
- A python library to define and validate data types in Docling. ☆185 · Updated this week
- Python interface to Apache PDFBox command-line tools. ☆77 · Updated 2 years ago
- Show the differences between two strings/text as a compact text, in markdown/HTML, in the terminal and more. ☆140 · Updated 3 months ago
- Integration between Django and LangChain ☆26 · Updated last year
- A Python library to extract tabular data from PDFs ☆66 · Updated 5 months ago
- Append/Concatenate .docx documents ☆120 · Updated last year
- Excel spreadsheet crawler and table parser for data extraction and querying ☆157 · Updated 7 months ago
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment. ☆65 · Updated 11 months ago
- Aspose.Words for Python via .NET examples and showcases ☆126 · Updated 3 weeks ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 · Updated 2 months ago
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python. ☆221 · Updated last month