DistrictDataLabs / entity-resolution
Tutorial code and data for the entity resolution workshops.
☆45 Updated 9 years ago
Related projects
Alternatives and complementary repositories for entity-resolution
- Algorithms for "schema matching" ☆25 Updated 8 years ago
- A small utility for converting Stanford GloVe vectors to HDF5 / NumPy ☆12 Updated 7 years ago
- Collection of some algorithms for entity resolution ☆28 Updated 9 years ago
- Multidimensional data explorer and visualization tool. ☆52 Updated 7 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 Updated 8 years ago
- My machine learning model for the See Click Predict Fix Kaggle competition ☆31 Updated 7 years ago
- A Topic Modeling toolbox ☆93 Updated 8 years ago
- Word2Vec models with Twitter data using Spark. Blog: ☆65 Updated 5 years ago
- Common data science and data engineering utilities to help us perform analytics. Our toolbox for data scientists, licensed under Apache-2… ☆30 Updated 6 years ago
- A guide on extracting entities from raw text in order to conduct social network analysis. ☆20 Updated 7 years ago
- Reinforcement Learning Algorithms ☆14 Updated 6 years ago
- Slides for my doc2vec workshop/talk ☆29 Updated 7 years ago
- deep entity resolution lite version ☆11 Updated 5 years ago
- A Cython implementation of the affine gap string distance ☆58 Updated last year
- Sample repo for luigi tasks & config ☆36 Updated 8 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆25 Updated last month
- Code base for representation learning of very short texts, such as tweets. By Cedric De Boom, IBCN, Ghent University, Belgium. ☆37 Updated 8 years ago
- Feature Engineering with Pipeline Talk at ODSC West 2016, Santa Clara ☆17 Updated 8 years ago
- Library for Geo-Inferencing in Twitter Data ☆28 Updated 8 years ago
- NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-lear… ☆30 Updated 6 years ago
- Text Preprocessing in Python ☆19 Updated 7 years ago
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o… ☆45 Updated 6 years ago
- A curated list of resources dedicated to text summarization ☆55 Updated 6 years ago
- Demo code for learning_text_transformer ☆25 Updated 9 years ago
- An automated ingestion service for blogs to construct a corpus for NLP research. ☆86 Updated 6 years ago
- Natural Language Processing with Spark's MLlib ☆62 Updated 7 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated last year
- Tutorial repo for the article "ML in Production" ☆30 Updated last year
- Relatively simple text classification powered by spaCy ☆42 Updated 9 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 Updated 8 years ago