sybrenjansen / text-scrubber
Python package that offers text scrubbing functionality, providing building blocks for string cleaning as well as normalizing geographical text (countries/states/cities)
☆22 Updated 9 months ago
Alternatives and similar repositories for text-scrubber
Users who are interested in text-scrubber are comparing it to the libraries listed below.
- Versatile Metrics Collection for Python ☆19 Updated last year
- ☆30 Updated 3 years ago
- ☆70 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆37 Updated last week
- A utility for labeling clusters of text data. ☆28 Updated 3 years ago
- Language detection using Spacy and Fasttext ☆55 Updated last year
- Declarative layer for your database. ☆37 Updated 2 years ago
- A simple and streamlined Python script to extract and filter links from a remote HTML resource. ☆24 Updated 5 months ago
- ☆69 Updated 3 years ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- AlgoTree ☆16 Updated 5 months ago
- Elemental makes Selenium automation faster and easier. ☆36 Updated last year
- Python Simple Object Storage - provides a list and dictionary interface that seamlessly stores data in a file, like a simplified database… ☆58 Updated 2 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want) ☆66 Updated 2 years ago
- Templated docstrings for Python classes ☆16 Updated last year
- Set-oriented Operations in Pandas ☆24 Updated 5 years ago
- Custom Python functions for working with SQLite FTS4 ☆22 Updated 2 years ago
- Kubetools is a tool and processes for developing and deploying microservices to Kubernetes. ☆14 Updated 6 months ago
- Distributed persistent Task Queue running on Dask ☆38 Updated 2 years ago
- A Python library for verifying code properties using natural language assertions. ☆34 Updated 3 months ago
- Python package for deduplication/entity resolution using active learning ☆80 Updated 10 months ago
- Type-aware Python JSON serialization and validation. ☆10 Updated 4 years ago
- ipython + REPL + coroutines - suffering ☆19 Updated 10 months ago
- AsyncIO serving for data science models ☆24 Updated 2 years ago
- 🐾 PdpCLI is a pandas DataFrame processing CLI tool which enables you to build a pandas pipeline from a configuration file. ☆15 Updated last year
- ☆15 Updated 3 years ago
- Python utility project to provide interoperability between JSON Typedef and Protobuf ☆27 Updated this week
- Next-generation Punkt sentence boundary detection with zero dependencies ☆17 Updated 2 months ago
- Slipstream provides a data-flow model to simplify development of stateful streaming applications. ☆37 Updated 2 months ago