chen1649chenli / dataOpsResource
Awesome List for Data Operations
☆21 · Updated 4 years ago
Related projects
Alternatives and complementary repositories for dataOpsResource
- Full stack data engineering tools and infrastructure set-up ☆41 · Updated 3 years ago
- Big Data Demystified meetup and blog examples ☆31 · Updated 2 months ago
- Sample configuration to deploy a modern data platform. ☆86 · Updated 2 years ago
- Blog post on ETL pipelines with Airflow ☆23 · Updated 4 years ago
- Instant search for and access to many datasets in PySpark. ☆34 · Updated 2 years ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … ☆18 · Updated 2 years ago
- Simple samples for writing ETL transform scripts in Python ☆22 · Updated 3 years ago
- ☆11 · Updated 2 years ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆30 · Updated last year
- New generation open-source data stack ☆61 · Updated 2 years ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆122 · Updated 3 years ago
- CLI for data platform ☆19 · Updated 11 months ago
- A series of workshop modules introducing Feast feature store. ☆19 · Updated 2 years ago
- PipeRider dbt workshop for DataTalksClub DE Zoomcamp ☆16 · Updated 11 months ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆24 · Updated last year
- ☆29 · Updated 10 months ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 · Updated last year
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆19 · Updated 4 years ago
- Awesome list of dataops products, open source and resources ☆24 · Updated 2 years ago
- A collection of python utility functions ☆12 · Updated 4 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 · Updated last year
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆21 · Updated 2 years ago
- MLflow-tracking server example with Minio and H2O ☆18 · Updated 5 years ago
- Curated list of resources about Apache Airflow ☆19 · Updated 3 years ago
- real-time data + ML pipeline ☆54 · Updated last week
- ☆90 · Updated last year
- Code examples for the Introduction to Kubeflow course ☆13 · Updated 3 years ago