Minyus / Python_Packages_for_Pipeline_Workflow
This article compares open-source Python packages for pipeline/workflow development: Airflow, Luigi, Gokart, Metaflow, Kedro, PipelineX.
☆56 · Updated 4 years ago
Alternatives and similar repositories for Python_Packages_for_Pipeline_Workflow:
Users who are interested in Python_Packages_for_Pipeline_Workflow are comparing it to the libraries listed below:
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆75 · Updated last year
- PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more ☆226 · Updated last year
- The easiest way to integrate Kedro and Great Expectations ☆53 · Updated 2 years ago
- ☆43 · Updated 2 years ago
- 🐍 Material for PyData Global 2021 Presentation: Effective Testing for Machine Learning Projects ☆81 · Updated 3 years ago
- Decorators that log stats. ☆108 · Updated last year
- Dockerized ML Cookiecutter ☆71 · Updated 2 years ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 · Updated 3 years ago
- A small Python library that can clump lists of data together. ☆148 · Updated 3 years ago
- Docker image for high-performance Machine Learning web applications. With Uvicorn managed by Gunicorn in Python 3.7 and 3.6, using Conda,… ☆67 · Updated 2 years ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆83 · Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆112 · Updated 10 months ago
- An abstraction layer for parameter tuning ☆35 · Updated 5 months ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 · Updated 4 years ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆78 · Updated 4 months ago
- A tool to deploy a mostly serverless MLflow tracking server on a GCP project with one command ☆67 · Updated last year
- 🍦 Deployment tool for online machine learning models ☆97 · Updated 2 years ago
- Bulwark is a package for convenient property-based testing of pandas dataframes. ☆224 · Updated 4 years ago
- Summarise and explore Pandas DataFrames ☆99 · Updated 4 years ago
- Bag of, not words, but tricks! ☆68 · Updated last year
- A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn fr… ☆57 · Updated 3 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 · Updated 9 months ago
- Automated Data Science and Machine Learning library to optimize workflow. ☆104 · Updated 2 years ago
- vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distr… ☆121 · Updated last month
- ☄️ Parallel and distributed training with spaCy and Ray ☆53 · Updated last year
- JupyterHub extension for ContainDS Dashboards ☆202 · Updated 6 months ago
- Data Analysis Baseline Library ☆130 · Updated 3 months ago
- 💫 PyScaffold extension for data-science projects ☆156 · Updated last week
- Automated Jupyter notebook testing. 📙 ☆41 · Updated last year
- Useful decorators every Data Scientist should know ☆29 · Updated 2 years ago