chen1649chenli / dataOpsResource
Awesome List for Data Operations
☆24 · Updated 5 years ago
Alternatives and similar repositories for dataOpsResource
Users who are interested in dataOpsResource are comparing it to the repositories listed below.
- manipulate pandas dataframes from the comfort of your browser ☆174 · Updated 4 years ago
- Convert monolithic Jupyter notebooks into maintainable Ploomber pipelines. ☆79 · Updated last year
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆125 · Updated 4 years ago
- Full stack data engineering tools and infrastructure set-up ☆57 · Updated 4 years ago
- Code examples showing flow deployment to various types of infrastructure ☆111 · Updated 2 years ago
- ☆31 · Updated last year
- Bare bones use-case for deploying a containerized web app (built in streamlit) on AWS. ☆93 · Updated last year
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆103 · Updated this week
- A curated list of awesome DataOps tools ☆210 · Updated 4 months ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆36 · Updated 6 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆58 · Updated 3 years ago
- Simple samples for writing ETL transform scripts in Python ☆24 · Updated 3 months ago
- A collection of python utility functions ☆11 · Updated 2 weeks ago
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆84 · Updated this week
- The easiest way to integrate Kedro and Great Expectations ☆54 · Updated 2 years ago
- Apache Spark Guide ☆34 · Updated 3 years ago
- Open Source Data Quality Monitoring. ☆163 · Updated 3 weeks ago
- Automated Jupyter notebook testing. ☆41 · Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 · Updated last week
- New generation opensource data stack ☆75 · Updated 3 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆81 · Updated last year
- real-time data + ML pipeline ☆53 · Updated last month
- A JupyterLab extension providing SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆92 · Updated this week
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … ☆20 · Updated 4 years ago
- Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark ☆74 · Updated 2 years ago
- This article compares open-source Python packages for pipeline/workflow development: Airflow, Luigi, Gokart, Metaflow, Kedro, PipelineX. ☆57 · Updated 5 years ago
- Airflow tutorial for PyCon 2019 ☆86 · Updated 2 years ago
- Template for data pipelines, ML workflows, API dev and monitoring ☆44 · Updated last year
- Function decorators for Pandas Dataframe column name and data type validation ☆19 · Updated 3 weeks ago
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆91 · Updated 2 years ago