DistrictDataLabs / entity-resolution
Tutorial code and data for the entity resolution workshops.
☆45 · Updated 9 years ago
Related projects
Alternatives and complementary repositories for entity-resolution
- Algorithms for "schema matching" ☆25 · Updated 8 years ago
- Collection of some algorithms for entity resolution ☆28 · Updated 9 years ago
- A small utility for converting Stanford GloVe vectors to HDF5 / NumPy ☆12 · Updated 7 years ago
- NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-lear… ☆30 · Updated 6 years ago
- A Topic Modeling toolbox ☆93 · Updated 8 years ago
- My machine learning model for the See Click Predict Fix Kaggle competition ☆31 · Updated 7 years ago
- Slides for my doc2vec workshop/talk ☆29 · Updated 7 years ago
- Python package aiding in entity disambiguation based on string and location matching ☆18 · Updated last year
- Topic models (just LDA for now) on the Hacker News corpus ☆22 · Updated 9 years ago
- Feature Engineering with Pipeline Talk at ODSC West 2016, Santa Clara ☆17 · Updated 8 years ago
- A Cython implementation of the affine gap string distance ☆58 · Updated last year
- Multidimensional data explorer and visualization tool. ☆52 · Updated 7 years ago
- Tutorial repo for the article "ML in Production" ☆30 · Updated last year
- Word2Vec models with Twitter data using Spark. Blog: ☆65 · Updated 5 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 · Updated 8 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated last year
- Stability analysis for topic models ☆50 · Updated 8 years ago
- Natural Language Processing with Spark's MLlib ☆62 · Updated 7 years ago
- feng - feature engineering for machine-learning champions ☆27 · Updated 7 years ago
- ☆8 · Updated 7 years ago
- Pydata Seattle 2015 Trend Estimation in Time Series Signals Deck + Notebooks ☆21 · Updated 9 years ago
- ☆11 · Updated 9 years ago
- ☆37 · Updated 9 years ago
- Ensemble topic modeling with matrix factorization ☆23 · Updated 6 years ago
- Sample repo for luigi tasks & config ☆36 · Updated 8 years ago
- This project is for the notebooks, code, and data for the "Vocabulary Analysis of Job Descriptions" tutorial at PyData 2017 Seattle ☆20 · Updated 7 years ago
- Using Word2Vec on lists and sets ☆34 · Updated 9 years ago
- Predicting happiness from demographics and poll answers ☆45 · Updated 7 years ago
- Tools for performing hyperparameter search with Scikit-Learn and Dask http://dask-searchcv.readthedocs.io ☆11 · Updated 6 years ago
- Semantic natural language understanding at scale using Spark, machine-learned annotators and deep-learned ontologies ☆20 · Updated 7 years ago