KennethanCeyer / awesome-data-pipeline
Awesome list for data pipelines
☆35 Updated 2 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users who are interested in awesome-data-pipeline are comparing it to the repositories listed below.
- A curated list of awesome DataOps tools ☆219 Updated last month
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆39 Updated 3 years ago
- Awesome list of DataOps products, open source and resources ☆24 Updated 3 years ago
- A collection of resources and blogs about Apache Superset ☆88 Updated 4 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆64 Updated 2 years ago
- New-generation open-source data stack ☆76 Updated 3 years ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres + Debezium CDC, MySQL,… ☆28 Updated 8 months ago
- Apache Spark Guide ☆35 Updated 3 years ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆181 Updated last year
- Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team … ☆131 Updated 2 months ago
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a… ☆44 Updated last year
- Spark data pipeline that processes movie ratings data. ☆31 Updated this week
- ☆78 Updated this week
- dlt-dagster-demo ☆13 Updated 2 years ago
- Cloned by the `dbt init` task ☆62 Updated last year
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆354 Updated this week
- A modern ELT demo using Airbyte, dbt, Snowflake and Dagster ☆27 Updated 3 years ago
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆91 Updated 2 years ago
- A curated list of Dagster code snippets for data engineers ☆56 Updated last year
- A full data warehouse infrastructure with ETL pipelines running inside Docker on Apache Airflow for data orchestration, AWS Redshift for … ☆141 Updated 5 years ago
- This is a repo with links to everything you'd ever want to learn about data engineering ☆10 Updated last year
- Data Tools Subjective List ☆89 Updated 2 years ago
- Stream/batch system with Hadoop, Spark on NYC taxi data | #DE ☆26 Updated 3 months ago
- Fivetran data models for QuickBooks using dbt. ☆33 Updated this week
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆78 Updated this week
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 3 years ago
- Simplify working with DataHub API endpoints ☆59 Updated 9 months ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ☆127 Updated 2 months ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 Updated 4 years ago