KimaruThagna / ml-pipelines-airflow
Demonstrating and Building ML pipelines in Airflow
☆11, updated 3 years ago
Alternatives and similar repositories for ml-pipelines-airflow:
Users who are interested in ml-pipelines-airflow are comparing it to the repositories listed below.
- Data Engineering with Scala, published by Packt ☆23, updated last year
- Cost Efficient Data Pipelines with DuckDB ☆51, updated 8 months ago
- Repo for CDC with debezium blog post ☆28, updated 6 months ago
- Full stack data engineering tools and infrastructure set-up ☆50, updated 4 years ago
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆18, updated 4 years ago
- Content for a talk on "The wonderful world of data quality tools in Python" ☆19, updated 3 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31, updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆56, updated 7 months ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆71, updated 3 years ago
- A write-audit-publish implementation on a data lake without the JVM ☆46, updated 7 months ago
- Demo assets for DAIS 2021 'Learn to use Databricks for the full ML lifecycle' Talk ☆13, updated 3 years ago
- Utility functions for dbt projects running on Spark ☆31, updated last month
- Generative AI in realtime with Confluent Cloud. ☆22, updated 11 months ago
- This repository contains recipes for Apache Pinot. ☆30, updated last month
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11, updated 10 months ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10, updated last year
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36, updated 8 months ago
- Duke MIDS: Data Engineering and DataOps Course ☆65, updated 2 months ago
- Code that was used as an example during the Data+AI Summit 2020 ☆15, updated 4 years ago
- Building 3D Trusted Data Pipelines With Dagster, Dbt, and Duckdb ☆20, updated last year
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆134, updated 2 months ago
- Scaling Machine Learning in Three Week course in a collaboration with O'Reilly following the guidance of Adi Polak's book - Scaling Machi… ☆23, updated last year
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11, updated 2 months ago
- Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market ☆57, updated 2 years ago