timClicks / slate
The simplest way to extract text from PDFs in Python
☆427 Updated 2 years ago
Related projects
Alternatives and complementary repositories for slate
- A fast and friendly PDF scraping library. ☆772 Updated last year
- Python2's stdlib csv module is nice, but it doesn't support unicode. This module is a drop-in replacement which *does*. If you prefer p… ☆594 Updated last year
- A more complete example of programming with PDFMiner, which continues where the default documentation stops ☆215 Updated 4 years ago
- A python script for summarizing articles using nltk ☆542 Updated 8 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 Updated 10 months ago
- A simple Python module for parsing human names into their individual components ☆658 Updated 5 months ago
- [not actively maintained] A lightweight Python library that uses Webkit to enable easy scraping of dynamic, Javascript-heavy web pages ☆533 Updated 7 years ago
- Train NLTK objects with zero code ☆747 Updated 4 years ago
- Python interface to the Stanford Named Entity Recognizer ☆293 Updated 3 years ago
- Mailing for human beings ☆590 Updated 5 years ago
- Python Wrapper for NVD3 - It's time for beautiful charts ☆663 Updated 7 months ago
- Import tables from any Wikipedia article as a dataset in Python ☆292 Updated 3 years ago
- pdfrw is a pure Python library that reads and writes PDFs ☆1,871 Updated 6 months ago
- Python charting for 80% of humans. ☆331 Updated 9 months ago
- Twitter text processing library (auto linking and extraction of usernames, lists and hashtags). ☆180 Updated 5 years ago
- Converts XML to Python objects ☆614 Updated 10 months ago
- Iterative JSON parser with Pythonic interface ☆616 Updated 4 years ago
- [DEPRECATED] Buildpack for Conda. ☆157 Updated 4 years ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,511 Updated 7 months ago
- python bindings to crunchbase ☆65 Updated last year
- pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. ☆243 Updated 6 months ago
- Unicode transliteration in Python (clone of Tomaž Šolc repository at zemanta.com) ☆114 Updated 9 years ago
- Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps. ☆166 Updated 3 years ago
- PyPrind - Python Progress Indicator Utility ☆549 Updated 3 years ago
- A Python toolkit for processing tabular data ☆416 Updated 3 months ago
- Python application for generating pseudo-random data ☆126 Updated 4 years ago
- Extract countries, regions and cities from a URL or text ☆220 Updated 4 years ago
- Find dates inside text using Python and get back datetime objects ☆635 Updated 6 months ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆382 Updated 2 years ago
- Reads, queries and modifies Microsoft Word 2007/2008 docx files. ☆1,072 Updated 9 years ago