rogeriomm / labtools-k8s

Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta Lake, Postgres + Debezium CDC, MySQL, Airflow, Kafka Strimzi, DataHub, OpenMetadata, Zeppelin, Jupyter, JFrog Container Registry

☆28

Alternatives and similar repositories for labtools-k8s

Users who are interested in labtools-k8s are comparing it to the repositories listed below

ssp-data / data-engineering-devops
Full stack data engineering tools and infrastructure set-up
☆56 Updated 4 years ago
Guilherme-Silveira / bigdata-docker
☆23 Updated 2 years ago
kaxil / airflowctl
A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects
☆223 Updated 5 months ago
Data-Engineer-Camp / modern-elt-demo
A modern ELT demo using airbyte, dbt, snowflake and dagster
☆28 Updated 2 years ago
arezamoosavi / AcidOnSpark-ETL
Delta-Lake, ETL, Spark, Airflow
☆48 Updated 3 years ago
airbytehq / open-data-stack
Open Data Stack Projects: Examples of End to End Data Engineering Projects
☆89 Updated 2 years ago
dbt-labs / spark-utils
Utility functions for dbt projects running on Spark
☆33 Updated 8 months ago
astronomer / cosmos-demo
Demo DAGs that show how to run dbt Core in Airflow using Cosmos
☆64 Updated 5 months ago
DataKitchen / data-observability-installer
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team …
☆127 Updated last week
bgarcevic / danish-democracy-data
☆38 Updated 7 months ago
borjavb / dbt-iceberg-poc
☆80 Updated last year
adidas / lakehouse-engine
The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…
☆268 Updated last week
konosp / dbt-airflow-docker-compose
Execution of dbt models using Apache Airflow through Docker Compose
☆121 Updated 2 years ago
astronomer / airflow-data-quality-demo
A repository of sample code to show data quality checking best practices using Airflow.
☆78 Updated 2 years ago
garystafford / tickit-data-lake-demo
Resources for video demonstrations and blog posts related to DataOps on AWS
☆182 Updated 3 years ago
guidok91 / spark-movies-etl
Spark data pipeline that processes movie ratings data.
☆30 Updated 2 weeks ago
Aiven-Labs / python-fake-data-producer-for-apache-kafka
The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and …
☆85 Updated last year
sodadata / soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
☆64 Updated 3 years ago
zsvoboda / ngods
New generation open-source data stack
☆73 Updated 3 years ago
ysfesr / Building-Data-LakeHouse
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
☆48 Updated last year
techindicium / dbt-dag-monitoring
dbt package for monitoring airflow DAGs and tasks
☆29 Updated 8 months ago
rajagurunath / lakehouse-sharing
A Table format agnostic data sharing framework
☆39 Updated last year
Stefen-Taime / Iceberg-Dbt-Trino-Hive-modern-open-source-data-stack
To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…
☆40 Updated last year
DataKitchen / dataops-testgen
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit…
☆64 Updated this week
delta-io / delta-examples
Delta Lake examples
☆229 Updated last year
luanmorenomaciel / big-data-on-k8s
☆21 Updated 3 years ago
gmyrianthous / dbt-airflow
A Python package that creates fine-grained dbt tasks on Apache Airflow
☆74 Updated this week
astronomer / airflow-dbt-demo
A repository of sample code to accompany our blog post on Airflow and dbt.
☆178 Updated 2 years ago
delta-io / delta-docs
Delta Lake Documentation
☆50 Updated last year
Data-Engineer-Camp / dbt-dimensional-modelling
Step-by-step tutorial on building a Kimball dimensional model with dbt
☆150 Updated last year