BayooG / bayoo-docx
Create and modify Word documents with Python
☆145 Updated 10 months ago
Alternatives and similar repositories for bayoo-docx:
Users who are interested in bayoo-docx are comparing it to the libraries listed below:
- A Python tool to help extract information from structured PDFs. ☆402 Updated 3 weeks ago
- Extract docx headers, footers, (formatted) text, footnotes, endnotes, properties, and images. ☆179 Updated this week
- The Python docx package cannot read paragraphs, tables and images in document order. It can only render all the paragraphs at once or all… ☆77 Updated last year
- ☆169 Updated 3 weeks ago
- A simple client for doccano API. ☆85 Updated 10 months ago
- Python interface to Apache PDFBox command-line tools. ☆75 Updated 2 years ago
- Demos, examples and utilities using PyMuPDF ☆646 Updated 9 months ago
- ☆79 Updated 3 years ago
- Implementation of the paper: Text Segmentation as a Supervised Learning Task ☆261 Updated 5 years ago
- Extract dates from text ☆64 Updated 4 years ago
- A tiny, generic implementation of the Myers diff algorithm ☆20 Updated 4 years ago
- ☆35 Updated 2 weeks ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆189 Updated 3 weeks ago
- Streamlit Named Entity Recognition (NER) annotation custom component ☆38 Updated 2 years ago
- 80x faster and 95% accurate language identification with Fasttext ☆152 Updated last year
- A pure-Python utility to extract text and images from docx files. ☆544 Updated last month
- ☆94 Updated 4 years ago
- Pure-Python library for adding annotations to PDFs ☆201 Updated 4 years ago
- PDF to XML ALTO file converter ☆237 Updated last week
- Python bindings to PDFium ☆560 Updated this week
- An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP. ☆105 Updated last year
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆151 Updated last year
- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition ☆274 Updated 2 years ago
- Python module that identifies Chinese text as being Simplified or Traditional ☆91 Updated 5 months ago
- ☆438 Updated 3 years ago
- ☆38 Updated 4 years ago
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan… ☆346 Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆154 Updated 5 months ago
- Find parts of long text or data, allowing for some changes/typos. ☆319 Updated 8 months ago
- DocBank: A Benchmark Dataset for Document Layout Analysis ☆605 Updated 8 months ago