moj-analytical-services / splink_demos
Interactive notebooks containing demonstration code of the splink library
☆38 Updated last year
Alternatives and similar repositories for splink_demos
Users who are interested in splink_demos are comparing it to the repositories listed below
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 Updated 4 years ago
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- ☆48 Updated last year
- A browser user interface for manual labeling of record pairs. ☆47 Updated 2 years ago
- ☄️ Parallel and distributed training with spaCy and Ray ☆54 Updated last year
- A maximum-strength name parser for record linkage. ☆37 Updated 3 weeks ago
- NLP: An Application for Public Policy, PyCon Ireland 2018 ☆26 Updated 2 years ago
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆73 Updated 4 months ago
- Data Scientist code test ☆19 Updated 5 years ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆30 Updated last year
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 Updated 3 years ago
- A Python package to create a database on the platform using our MoJ data warehousing framework ☆22 Updated this week
- Build your feature store with macros right within your dbt repository ☆39 Updated 2 years ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Prototype search engine for ONS bulletins ☆24 Updated last year
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆78 Updated last year
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆282 Updated 2 years ago
- Fast, flexible name matching for large datasets ☆72 Updated last month
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆14 Updated 6 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- A Python package to build predictive linear and logistic regression models focused on performance and interpretation ☆30 Updated last year
- 🐍 Material for PyData Global 2021 Presentation: Effective Testing for Machine Learning Projects ☆80 Updated 3 years ago
- Buy Till You Die and Customer Lifetime Value statistical models in Python. ☆117 Updated last year
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆36 Updated 5 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 6 years ago
- This project is a wrapper for Leilex, a legal entity identifier API. Includes ISIN-LEI conversion. Search for an LEI number using a company name. ☆24 Updated 9 months ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 3 years ago
- Abstractions for feature engineering on large graphs of tabular data. ☆21 Updated last month
- A hands-on tutorial showing how to use Python to do anonymisation with synthetic data ☆79 Updated 3 years ago
- Sample projects using Ploomber. ☆86 Updated last year