chen1649chenli / dataOpsResource
Awesome List for Data Operations
☆24 Updated 4 years ago
Related projects
Alternatives and complementary repositories for dataOpsResource
- Full stack data engineering tools and infrastructure set-up ☆44 Updated 3 years ago
- Instant search for and access to many datasets in PySpark. ☆34 Updated 2 years ago
- Awesome list of DataOps products, open source and resources ☆24 Updated 2 years ago
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi… ☆14 Updated 3 years ago
- ☆29 Updated 11 months ago
- Automated Jupyter notebook testing. 📙 ☆41 Updated 9 months ago
- A collection of Python utility functions ☆12 Updated 4 months ago
- Demo on how to use Prefect 2 in an ML project ☆40 Updated 2 years ago
- Code examples showing flow deployment to various types of infrastructure ☆102 Updated last year
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- A curated list of awesome DataOps tools ☆158 Updated last month
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 2 years ago
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 Updated last year
- Content for a talk on "The wonderful world of data quality tools in Python" ☆18 Updated 3 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated 3 months ago
- A tool to automatically infer column data types in .csv files ☆34 Updated last year
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 Updated 6 months ago
- Code examples for the Introduction to Kubeflow course ☆13 Updated 3 years ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆78 Updated 2 months ago
- Weekly Data Engineering Newsletter ☆93 Updated 4 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆111 Updated 7 months ago
- PipeRider dbt workshop for DataTalksClub DE Zoomcamp ☆16 Updated 11 months ago
- A few end to end examples that use data-describe ☆16 Updated last year
- ☆25 Updated 8 months ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 Updated last year
- Set up a Cost-Effective Modern Data Stack for a Charity ☆19 Updated 8 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated last year
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 Updated 2 months ago
- Data-aware orchestration with Dagster, dbt, and Airbyte ☆30 Updated last year