ngmarchant / oasis
A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling).
☆15 · Updated 3 years ago
Related projects:
- A Cython implementation of the affine gap string distance ☆58 · Updated last year
- Scalable String Similarity Joins in Python ☆39 · Updated 2 months ago
- ☆38 · Updated this week
- Hidden alignment conditional random field for classifying string pairs. ☆25 · Updated this week
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 · Updated 3 years ago
- Ensemble topic modelling with pLSA ☆112 · Updated 2 years ago
- Algorithms for "schema matching" ☆25 · Updated 8 years ago
- Matrix tools for building and inspecting latent spaces ☆27 · Updated 6 years ago
- Set-oriented Operations in Pandas ☆24 · Updated 4 years ago
- ☆46 · Updated this week
- A browser user interface for manual labeling of record pairs. ☆41 · Updated last year
- Python library for Ceteris Paribus Plots (What-if plots) ☆19 · Updated 3 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe ☆25 · Updated 3 years ago
- Datadiff is diff for data ☆26 · Updated 4 years ago
- Venn diagrams with word clouds ☆49 · Updated 4 months ago
- Fast hierarchical clustering routines for R and Python. ☆134 · Updated 2 months ago
- Scikit-learn compatible Topic Modelling with Hierarchical Statistical Block Models (Gerlach, Peixoto and Altmann, 2018) ☆28 · Updated 5 years ago
- Patsy Adaptors for Scikit-learn ☆48 · Updated 5 years ago
- Scripts for ECML PKDD 2018 article: Similarity encoding for learning with dirty categorical variables ☆11 · Updated 6 years ago
- A maximum-strength name parser for record linkage. ☆29 · Updated last month
- Dask tutorial for PyData DC 2016 ☆11 · Updated 7 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated last year
- Multidimensional isotonic regression ☆26 · Updated 7 years ago
- ☆32 · Updated 7 years ago
- Predict age and gender from a first name ☆60 · Updated 5 years ago
- Python wrapper for a C++ Double Metaphone ☆15 · Updated last year
- This is a set of utilities and formats that illustrate how one could begin to perform operations on causal graphs and sample over these g… ☆27 · Updated 9 years ago
- Simplified tree-based classifier and regressor for interpretable machine learning (scikit-learn compatible) ☆46 · Updated 3 years ago
- a supplemental library for machine learning that incorporates various conveniences and functionalities either missing or not presently co… ☆9 · Updated 6 years ago
- Genie: Fast and Robust Hierarchical Clustering with Noise Point Detection - in Python and R ☆58 · Updated 3 weeks ago