CenterForOpenScience / pydocx
An extendable docx file format parser and converter
☆190 Updated 4 years ago
Related projects
Alternatives and complementary repositories for pydocx
- Python package for Google's diff-match-patch native C++ implementation. ☆73 Updated 5 months ago
- CSS Selectors for Python ☆291 Updated last month
- A library for extracting tables from PDF files ☆88 Updated 4 years ago
- Whoosh + SQLAlchemy ☆32 Updated 7 years ago
- xmljson converts XML into Python dictionary structures (trees, like in JSON) and vice versa. ☆122 Updated last year
- Convert html to docx ☆74 Updated 4 months ago
- Convert a docx (OOXML) file to html. This project is deprecated in favor of https://github.com/OpenScienceFramework/pydocx ☆45 Updated 10 years ago
- Customizable Flask - SQLAlchemy - Whoosh integration ☆85 Updated 9 months ago
- Convert Word documents (.docx files) to HTML ☆815 Updated 5 months ago
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 Updated 3 years ago
- Python library for parsing .docx (Office Open XML) files ☆51 Updated 4 years ago
- Python pagination module ☆78 Updated 2 months ago
- Super-fast and clean conversions to numbers for Python. ☆106 Updated last week
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆215 Updated 4 years ago
- Transport adapter for fetching file:// URLs with the requests python library ☆86 Updated 4 months ago
- Crochet: use Twisted anywhere! ☆236 Updated 2 months ago
- Backport of Python 3's csv module for Python 2 ☆64 Updated 4 years ago
- Python library for manipulating Open Packaging Convention (OPC) files like .docx, .pptx, and .xlsx ☆42 Updated 7 years ago
- Create, read, and modify Excel .xlsx files ☆103 Updated 4 years ago
- Fork of ReportLab http://www.reportlab.com/ftp/reportlab-2.5.tar.gz ☆36 Updated last year
- Generate PDF files out of your Flask website thanks to WeasyPrint ☆144 Updated this week
- URL normalization for Python ☆94 Updated 2 years ago
- Full-text search for MySQL in SQLAlchemy ☆91 Updated 3 years ago
- Minimalistic evaluator of Python expressions using the ast module ☆183 Updated last month
- Conservatively convert html to markdown ☆98 Updated 4 years ago
- Append/Concatenate .docx documents ☆104 Updated 3 months ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆167 Updated this week
- Binary Python bindings for poppler utils for content extraction ☆42 Updated 3 years ago
- A utility to read and write PDFs with Python ☆332 Updated 2 years ago
- Ultra-lightweight pure Python package to check if a file is binary or text. ☆140 Updated 4 months ago