ecatkins / xpdf_python
Python wrapper for xpdf
☆19 Updated 5 years ago
Alternatives and similar repositories for xpdf_python
Users who are interested in xpdf_python are comparing it to the libraries listed below.
- Calculate readability scores ☆42 Updated 6 years ago
- Extract dates from text ☆64 Updated 4 years ago
- A Python client for connecting to all the services provided by https://dandelion.eu ☆36 Updated last year
- Using ML to extract campaign finance data from messy forms for journalism ☆76 Updated 2 years ago
- ☆19 Updated 3 years ago
- Named entity recognition for the legal domain ☆42 Updated 4 years ago
- Python package for deduplication using flexible text matching and cleaning in pandas DataFrames ☆25 Updated 4 years ago
- Language detection using Spacy and Fasttext ☆55 Updated last year
- ☆32 Updated 6 years ago
- semantically distinct key phrase extraction using hilbert hashes. ☆49 Updated 3 years ago
- Build intelligent data-driven applications with minimal effort. Sentence Clustering, Topics Extraction, Text Similarity, Opinion Summariz… ☆40 Updated 5 years ago
- ☆17 Updated 3 years ago
- Analyze XML extracted from PDFs (e.g. from TET or PDFMiner) ☆20 Updated 7 years ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- How to do data science with Optimus, Spark and Python. ☆19 Updated 5 years ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- Python version of the SymSpell Compound algorithm ☆12 Updated 6 years ago
- text-data pre-processing utility ☆13 Updated 2 years ago
- THIS REPOSITORY IS FORK ☆30 Updated 2 years ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 Updated 6 years ago
- 🚀GUI for training spaCy models ☆55 Updated 4 years ago
- Natural Language Generation for Gramex applications. ☆25 Updated 2 years ago
- ☆22 Updated 6 years ago
- Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG. ☆105 Updated 2 years ago
- Simple, Pythonic extraction of text, shapes and images from PDFs ☆79 Updated 5 years ago
- Dataframe Integration with spaCy. ☆103 Updated 4 years ago
- A visualisation tool for Spacy using Hierplane. ☆65 Updated 2 years ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 Updated 2 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated… ☆26 Updated 2 years ago
- [Project INVALID not supported anymore] ☆37 Updated 5 years ago