booktype / python-ooxml
Python library for parsing .docx (Office Open XML) files
☆53Updated 5 years ago
Alternatives and similar repositories for python-ooxml
Users who are interested in python-ooxml are comparing it to the libraries listed below
- Python 3 port of pdfminer☆187Updated 7 years ago
- Create, read, and modify Excel .xlsx files☆114Updated 5 years ago
- An extendable docx file format parser and converter☆194Updated 8 months ago
- Python library for manipulating Open Packaging Convention (OPC) files like .docx, .pptx, and .xlsx☆47Updated 8 years ago
- Text (source code) search engine with indexer and a front end web interface to search. Uses Python 3.☆126Updated 2 years ago
- Python bindings for CHMLIB☆57Updated 7 months ago
- Convert a docx (OOXML) file to html. This project is deprecated in favor of https://github.com/OpenScienceFramework/pydocx☆47Updated 11 years ago
- ☆43Updated last week
- Fast Indexed python HTML parser which builds a DOM node tree, providing common getElementsBy* functions for scraping, testing, modificati…☆102Updated 2 years ago
- A utility to read and write PDFs with Python☆338Updated 4 years ago
- Fast multi-keyword search engine for text strings☆258Updated last year
- Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.☆341Updated last year
- A fast, pure-Python, untyped, in-memory database engine, using Python syntax to manage data, instead of SQL, inspired by PyDbLite.☆20Updated 8 years ago
- A pure python based utility to extract text and images from docx files.☆580Updated 10 months ago
- Python bindings for WPS Office RPC (for Linux)☆281Updated 9 months ago
- Pure python Aho-Corasick library.☆220Updated last week
- A Python tool to help extracting information from structured PDFs.☆427Updated last month
- Convert html to docx☆86Updated last year
- Constants used in Chinese text processing☆387Updated last year
- Python module for JSON data encoding, including jsonlint. See the project Wiki here on Github. Also read the README at the bottom of th…☆306Updated 5 years ago
- Insert HTML or Markdown into a Word document☆86Updated 5 years ago
- Language Savant, Python clone of github/linguist.☆154Updated 5 years ago
- Phantompy is a headless WebKit engine with a powerful Pythonic API built on top of Qt5 WebKit☆612Updated 8 years ago
- bamboo is a Chinese language processing system.☆14Updated 14 years ago
- A readability parser which can extract title, content, images from html pages☆86Updated 5 years ago
- Python bindings for SQLCipher☆136Updated 3 years ago
- 🐍 A CPython extension for the Hyperscan regular expression matching library.☆190Updated last month
- Web interface for Advanced Python Scheduler☆53Updated 13 years ago
- ☆61Updated 6 years ago
- Jabba's headless webkit browser for scraping AJAX-powered webpages.☆90Updated 11 years ago