dedupeio / dedupe
A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.
☆4,141 · Updated last week
Related projects
Alternatives and complementary repositories for dedupe
- Examples for using the dedupe library ☆406 · Updated 3 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆963 · Updated 8 months ago
- NLP, before and after spaCy ☆2,215 · Updated last year
- Command line tool for deduplicating CSV files ☆412 · Updated 4 years ago
- A Python library for parsing unstructured western names into name components. ☆593 · Updated last week
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,143 · Updated 3 months ago
- Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. ☆9,150 · Updated this week
- A toolkit for making domain-specific probabilistic parsers ☆797 · Updated last month
- Python library for interactive topic model visualization. Port of the R LDAvis package. ☆1,805 · Updated 4 months ago
- ☆3,148 · Updated 2 years ago
- A Python library for parsing unstructured United States address strings into address components ☆1,530 · Updated last month
- A system for quickly generating training data with weak supervision ☆5,807 · Updated 6 months ago
- Beautiful visualizations of how language differs among document types. ☆2,243 · Updated last month
- sqldf for pandas ☆1,341 · Updated 3 months ago
- 🔮 A refreshing functional take on deep learning, compatible with your favorite libraries ☆2,819 · Updated last month
- 💫 Industrial-strength Natural Language Processing (NLP) in Python ☆30,149 · Updated 2 weeks ago
- 🪼 A Python library for doing approximate and phonetic matching of strings. ☆2,067 · Updated last week
- Module for automatic summarization of text documents and HTML pages. ☆3,520 · Updated 5 months ago
- Fuzzy String Matching in Python ☆9,225 · Updated last year
- 🦆 Contextually-keyed word vectors ☆1,623 · Updated 7 months ago
- Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization. ☆8,743 · Updated 4 months ago
- Topic Modelling for Humans ☆15,657 · Updated 2 months ago
- A Python implementation of LightFM, a hybrid recommendation algorithm. ☆4,767 · Updated 3 months ago
- Python Extract Transform and Load Tables of Data ☆1,248 · Updated 5 months ago
- Python package to calculate readability statistics of a text object: paragraphs, sentences, articles. ☆1,146 · Updated 5 months ago
- Data Migration for the Blaze Project ☆1,004 · Updated 2 years ago
- Extract text from any document. No muss. No fuss. ☆3,905 · Updated this week
- Lifetime value in Python ☆1,449 · Updated 4 months ago
- Python bindings to libpostal for fast international address parsing/normalization ☆766 · Updated 4 months ago
- A Python data analysis library that is optimized for humans instead of machines. ☆1,173 · Updated 3 months ago