mllite / sklearn2sql-demo
Demo of an in-database processing tool for scikit-learn
☆13Updated 3 years ago
Alternatives and similar repositories for sklearn2sql-demo
Users who are interested in sklearn2sql-demo are comparing it to the libraries listed below.
- Build your feature store with macros right within your dbt repository☆39Updated 3 years ago
- ☆41Updated last year
- Abstractions for feature engineering on large graphs of tabular data.☆24Updated 2 months ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies☆21Updated 3 years ago
- Record matching and entity resolution at scale in Spark☆36Updated 2 years ago
- Set of IPython and Jupyter extensions to improve user experience☆50Updated 6 years ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same…☆30Updated 3 years ago
- A tool for compiling trained SKLearn models into other representations (such as SQL, Sympy or Excel formulas)☆176Updated 3 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro…☆29Updated 3 years ago
- @vega transforms with @ibis-project expressions☆29Updated 4 years ago
- Python package for deduplication/entity resolution using active learning☆83Updated last year
- Pipeline components that support partial_fit.☆46Updated last year
- SnowShu is a sampling engine designed to support testing in data development.☆12Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator☆48Updated last year
- Pandas helper functions☆31Updated 2 years ago
- Functional Airflow DAG definitions.☆38Updated 8 years ago
- Tutorial for implementing data validation in data science pipelines☆33Updated 3 years ago
- Supporting materials/code examples for my course in data engineering for machine learning.☆39Updated 3 years ago
- A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting, with support for Spark SQL and Trino☆93Updated this week
- Python wrapper for a C++ Double Metaphone☆15Updated 3 weeks ago
- Marshmallow Schema generator for Pandas DataFrames☆24Updated 5 years ago
- ☆40Updated 2 years ago
- Set-oriented Operations in Pandas☆24Updated 5 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated 2 months ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊☆79Updated last year
- Dask integration for Snowflake☆30Updated 6 months ago
- Python library for automated dataset normalization☆117Updated 2 years ago
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information.☆155Updated 4 months ago
- quadipy is a Python package to help transform structured data into RDF graph format☆19Updated 2 years ago
- A maximum-strength name parser for record linkage.☆39Updated 5 months ago