wesmadrigal / GraphReduce
Abstractions for feature engineering on large graphs of tabular data.
☆22 Updated last week
Alternatives and similar repositories for GraphReduce
Users who are interested in GraphReduce are comparing it to the libraries listed below.
- Pipeline components that support partial_fit. ☆46 Updated last year
- Record matching and entity resolution at scale in Spark ☆35 Updated last year
- An abstraction layer for parameter tuning ☆35 Updated last year
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆81 Updated last week
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆55 Updated 2 months ago
- this repo might get accepted ☆28 Updated 4 years ago
- Buy Till You Die and Customer Lifetime Value statistical models in Python. ☆117 Updated last year
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆19 Updated 2 years ago
- SciKIt-learn Pipeline in PAndas ☆42 Updated 2 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 Updated 4 years ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆115 Updated last month
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆29 Updated 2 years ago
- A library of Reversible Data Transforms ☆126 Updated last week
- Capture all information throughout your model's development in a reproducible way and tie results directly to the model code! ☆137 Updated last week
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- Tutorial for implementing data validation in data science pipelines ☆33 Updated 3 years ago
- Python package implementing transformers for pre processing steps for machine learning. ☆65 Updated last week
- Explore and compare 1K+ accurate decision trees in your browser! ☆166 Updated last year
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38 Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 Updated last year
- 📊 Explain why metrics change by unpacking them ☆39 Updated 2 months ago
- ⚓ Eurybia monitors model drift over time and securizes model deployment with data validation ☆213 Updated 10 months ago
- Prune your sklearn models ☆19 Updated 10 months ago
- A tool for compiling trained SKLearn models into other representations (such as SQL, Sympy or Excel formulas) ☆175 Updated 2 years ago
- ☆27 Updated 4 years ago
- Your favorite Python graph libraries, scalable and interoperable. Graph databases in memory, and familiar graph APIs for cloud databases. ☆111 Updated 3 months ago
- Type System for Data Analysis in Python ☆213 Updated 7 months ago
- Spark implementation of computing Shapley Values using monte-carlo approximation ☆76 Updated 2 years ago
- Python package for text mining of time-series data ☆76 Updated 4 months ago
- Automated Jupyter notebook testing. 📙 ☆41 Updated last year