JonathanRaiman / epub_conversion
Python package for converting XML and EPUBs to text files
☆34 · Updated 4 years ago
Related projects
Alternatives and complementary repositories for epub_conversion
- Set-oriented Operations in Pandas · ☆24 · Updated 4 years ago
- A maximum-strength name parser for record linkage. · ☆32 · Updated 3 months ago
- A utility for labeling clusters of text data. · ☆28 · Updated 3 years ago
- A curated list of ML awesome frameworks & libraries for text data · ☆16 · Updated last year
- A Flask webapp that categorizes Outlook emails using machine learning · ☆15 · Updated 9 years ago
- A python module that will check for package updates. · ☆28 · Updated 3 years ago
- Python and pandas tools to perform various analyses on different types of word lists · ☆16 · Updated 9 years ago
- Find which links on a web page are pagination links · ☆29 · Updated 7 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe · ☆25 · Updated 3 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… · ☆65 · Updated 2 years ago
- Python wrapper for a C++ Double Metaphone · ☆15 · Updated last year
- ☆29 · Updated 2 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc. · ☆16 · Updated last year
- A natural language date parser. (Python version of chrono.js) · ☆25 · Updated 5 months ago
- Aho-Corasick string replacement utility · ☆23 · Updated 4 years ago
- Collaboration app for sharing and reviewing jupyter notebooks · ☆16 · Updated last year
- Comparing Polars to Pandas and a small introduction · ☆43 · Updated 3 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. · ☆42 · Updated 5 years ago
- Simple tools for summarizing .mbox email archives. · ☆10 · Updated 4 years ago
- Binary Python bindings for poppler utils for content extraction · ☆42 · Updated 3 years ago
- Automatically install missing Python modules using pip at import time. · ☆18 · Updated 10 months ago
- Collection of code snippets and utilities for streamlit apps · ☆22 · Updated 4 years ago
- December 14th Python Meetup Files · ☆37 · Updated 11 years ago
- Python library for modern thread / multiprocessing pooling and task processing via asyncio · ☆15 · Updated 3 years ago
- an app that makes your personalized newsletter based on your bookmarks · ☆11 · Updated 7 years ago
- Datasette plugin for authenticating access using API tokens · ☆12 · Updated 2 months ago
- Automatically exported from code.google.com/p/guess-language · ☆53 · Updated 8 months ago
- (Archived) A Python library for record linkage and deduplication. · ☆19 · Updated 7 months ago
- stemgraphic python package for visualization of data and text · ☆17 · Updated 3 years ago