lukewhyte / textpack
Group thousands of similar spreadsheet or database text entries in seconds
☆155Updated last year
Related projects
Alternatives and complementary repositories for textpack
- Dataframe Integration with spaCy.☆102Updated 3 years ago
- Abydos NLP/IR library for Python☆183Updated 2 years ago
- Fuzzy matching and more functionality for spaCy.☆252Updated 4 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata☆153Updated 2 years ago
- Simplifies use of the Dedupe library via Pandas☆136Updated last year
- ☆67Updated 2 years ago
- Text analysis with networks.☆284Updated 6 months ago
- Notebooks configured to be run with Binder, usually found on my blog.☆41Updated last year
- Gain clues from clustering!☆305Updated 4 months ago
- Fuzzy matches and merging of datasets in pandas using csvmatch☆74Updated 4 years ago
- A Python module to convert natural language numerics into ints and floats.☆225Updated last month
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4☆281Updated 2 years ago
- Text Mining and Topic Modeling Toolkit for Python with parallel processing power☆193Updated last year
- Interpretable data visualizations for understanding how texts differ at the word level☆273Updated 4 months ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.☆60Updated this week
- Python package to accelerate the sparse matrix multiplication and top-n similarity selection☆399Updated last month
- ☄️ Parallel and distributed training with spaCy and Ray☆54Updated last year
- Today I Learned Some Computer Stuff☆39Updated 6 years ago
- Fast, flexible name matching for large datasets☆70Updated 11 months ago
- A maximum-strength name parser for record linkage.☆34Updated 3 months ago
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy☆287Updated last year
- Bag of, not words, but tricks!☆68Updated last year
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets.☆59Updated this week
- Labelling platform for text using weak supervision.☆260Updated 2 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching☆140Updated last month
- Fuzzy string matching, grouping, and evaluation.☆748Updated 6 months ago
- Using ML to extract campaign finance data from messy forms for journalism☆76Updated 2 years ago
- Super Fast String Matching in Python☆364Updated 6 months ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.☆88Updated 2 years ago
- PYthon Automated Term Extraction☆305Updated last year