JonathanRaiman / epub_conversion
Python package for converting XML and EPUB files to text files
☆34 · Updated 4 years ago
Alternatives and similar repositories for epub_conversion:
Users who are interested in epub_conversion are comparing it to the libraries listed below.
- A Python module that will check for package updates. ☆28 · Updated 3 years ago
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- Aho-Corasick string replacement utility ☆24 · Updated 5 years ago
- Python wrapper for a C++ Double Metaphone ☆15 · Updated 2 years ago
- Python and pandas tools to perform various analyses on different types of word lists ☆16 · Updated 10 years ago
- A curated list of ML awesome frameworks & libraries for text data ☆16 · Updated 2 years ago
- Find which links on a web page are pagination links ☆29 · Updated 8 years ago
- A scraping Master-slave system based on Google App Engine ☆11 · Updated 4 years ago
- CLI based diff viewer ☆23 · Updated 3 years ago
- Tracebacks for Humans (in Jupyter notebooks) ☆12 · Updated 2 months ago
- Automatically exported from code.google.com/p/guess-language ☆53 · Updated last year
- Python package for performing deduplication using flexible text matching and cleaning in pandas DataFrames ☆25 · Updated 4 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- Set-oriented Operations in Pandas ☆24 · Updated 4 years ago
- A financial disclosure data extraction tool. ☆14 · Updated last year
- Python wrapper for Ferret ☆40 · Updated 3 years ago
- Where I keep my Python notes for starting projects ☆9 · Updated 2 years ago
- Binary Python bindings for poppler utils for content extraction ☆42 · Updated 3 years ago
- Loadable spellfix1 extension for sqlite as python package ☆26 · Updated 11 months ago
- 🐾 PdpCLI is a pandas DataFrame processing CLI tool which enables you to build a pandas pipeline from a configuration file. ☆15 · Updated last year
- Python based Wikidata framework for easy dataframe extraction ☆43 · Updated last year
- Find duplicate text files. ☆14 · Updated 2 months ago
- A Python package that simplifies the use of secrets in a Jupyter notebook ☆21 · Updated 3 years ago
- ☆12 · Updated 8 years ago
- 🌸 Train floret vectors ☆18 · Updated last year
- Dask tutorial for PyData DC 2016 ☆11 · Updated 8 years ago
- Statistical visualizations for Datasette using Seaborn ☆12 · Updated 3 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆148 · Updated 2 months ago
- An app that makes your personalized newsletter based on your bookmarks ☆11 · Updated 7 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated 2 years ago