felipeochoa / minecart
Simple, Pythonic extraction of text, shapes and images from PDFs
☆79 Updated 5 years ago
Alternatives and similar repositories for minecart
Users that are interested in minecart are comparing it to the libraries listed below
- Python interface to Apache PDFBox command-line tools. ☆75 Updated 2 years ago
- Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG. ☆105 Updated 2 years ago
- Hunspell extension for spaCy 2.0. ☆94 Updated 10 months ago
- Get list of common stop words in various languages in Python ☆156 Updated last year
- A fully customisable language detection pipeline for spaCy ☆93 Updated 6 years ago
- Language detection extension for spaCy 2.0+ ☆113 Updated 6 years ago
- Library for unit extraction - fork of quantulum for python3 ☆141 Updated last year
- Extract dates from text ☆64 Updated 4 years ago
- Soundex Phonetic Code Algorithm Demo for Indian Languages. Supports all indian languages and English. Provides intra-indic string compari… ☆58 Updated 6 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆66 Updated 2 years ago
- Textpipe: clean and extract metadata from text ☆302 Updated 4 years ago
- (Official repo for pypi package) Python bindings for the Hunspell spellchecker engine ☆186 Updated 4 years ago
- Server/Client around Spacy to load spacy only once ☆46 Updated 7 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆152 Updated 5 months ago
- Convert number words (eg. twenty one) to numeric digits (21) ☆176 Updated last year
- Handle many API calls from a single HTTP request ☆55 Updated 6 years ago
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆214 Updated 5 years ago
- Relatively simple text classification powered by spaCy ☆41 Updated 9 years ago
- Recipe for Spanish POS tagging using the CESS corpus with NLTK ☆18 Updated 8 years ago
- ☆171 Updated 3 months ago
- Perform lexical analysis on words, one word at a time. ☆64 Updated 7 years ago
- Detect Language API Python Client ☆70 Updated 3 years ago
- AsyncIO serving for data science models ☆24 Updated 2 years ago
- International Address formatter which considers the standard formatting rules of the country ☆26 Updated 3 years ago
- 💙 Emoji handling and meta data for spaCy with custom extension attributes ☆181 Updated 2 years ago
- A python module to parse the Open Graph Protocol ☆231 Updated 3 years ago
- I analysed online user comments on articles by German news publishers SPON, ZEIT, and Focus ☆19 Updated 7 years ago
- A compound word splitter for Python ☆48 Updated 3 years ago
- Python version of the SymSpell Compound algorithm ☆12 Updated 6 years ago
- Binary Python bindings for poppler utils for content extraction ☆42 Updated 4 years ago