DistrictDataLabs / entity-resolution
Tutorial code and data for the entity resolution workshops.
☆43 · Updated 9 years ago
Alternatives and similar repositories for entity-resolution:
Users who are interested in entity-resolution are comparing it to the libraries listed below:
- Topic models (just LDA for now) on the Hacker News corpus ☆22 · Updated 9 years ago
- Algorithms for "schema matching" ☆26 · Updated 8 years ago
- A Topic Modeling toolbox ☆92 · Updated 8 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated 2 years ago
- Python package aiding in entity disambiguation based on string and location matching ☆18 · Updated last year
- ☆21 · Updated 8 years ago
- Collection of some algorithms for entity resolution ☆28 · Updated 9 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆24 · Updated 5 months ago
- Multidimensional data explorer and visualization tool. ☆56 · Updated 7 years ago
- A curated list of resources dedicated to text summarization ☆55 · Updated 6 years ago
- Predict age and gender from a first name ☆60 · Updated 6 years ago
- A small utility for converting Stanford GloVe vectors to HDF5 / NumPy ☆12 · Updated 7 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 · Updated 8 years ago
- An in depth tutorial on sklearn's Pipeline and FeatureUnion classes. ☆16 · Updated 7 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe ☆25 · Updated 4 years ago
- Scalable String Similarity Joins in Python ☆38 · Updated 7 months ago
- NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering) ☆115 · Updated 10 months ago
- Data Server for Topic Models ☆121 · Updated last year
- Sample repo for luigi tasks & config ☆36 · Updated 8 years ago
- A Cython implementation of the affine gap string distance ☆57 · Updated 2 years ago
- create a browser of a corpus using a topic model; original TMVE implementation (static pages) ☆47 · Updated 9 years ago
- Word2Vec models with Twitter data using Spark. Blog: ☆65 · Updated 6 years ago
- Library for Geo-Inferencing in Twitter Data ☆28 · Updated 8 years ago
- ☆46 · Updated 2 weeks ago
- Tools and services for evaluating topic models ☆15 · Updated 8 years ago
- Code base for representation learning of very short texts, such as tweets. By Cedric De Boom, IBCN, Ghent University, Belgium. ☆37 · Updated 8 years ago
- Simplified tree-based classifier and regressor for interpretable machine learning (scikit-learn compatible) ☆47 · Updated 4 years ago
- My machine learning model for the See Click Predict Fix Kaggle competition ☆31 · Updated 7 years ago
- Reinforcement Learning Algorithms ☆14 · Updated 6 years ago
- [development moved to termite-data-server] ☆61 · Updated 11 years ago