KennethanCeyer / awesome-data-pipeline
Awesome list for data pipelines
☆35 Updated 2 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users who are interested in awesome-data-pipeline are comparing it to the repositories listed below
- A curated list of awesome DataOps tools ☆207 Updated 3 months ago
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- Awesome list of dataops products, open source and resources ☆24 Updated 3 years ago
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆89 Updated 2 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆35 Updated 3 years ago
- Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work ☆47 Updated 3 years ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆28 Updated 2 years ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ☆111 Updated 4 months ago
- Apache Spark Guide ☆34 Updated 3 years ago
- New generation opensource data stack ☆74 Updated 3 years ago
- End to End Sales Streaming Pipeline (FastAPI, Kafka, Spark, Cassandra, MySQL, Superset) ☆10 Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆48 Updated last year
- Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team … ☆128 Updated 3 weeks ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 3 years ago
- A curated list of dagster code snippets for data engineers ☆56 Updated last year
- A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for … ☆139 Updated 5 years ago
- A curated list of awesome Databricks resources, including Spark ☆22 Updated last year
- A list of free datasets that provide streaming data ☆421 Updated last year
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a… ☆40 Updated last year
- ☆70 Updated last week
- A curated list of open source tools used in analytics platforms and data engineering ecosystem ☆388 Updated 7 months ago
- dlt-dagster-demo ☆13 Updated last year
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆349 Updated last week
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆61 Updated 2 years ago
- Spark data pipeline that processes movie ratings data. ☆30 Updated 3 weeks ago
- A curated list of open source alternatives for data analytics start-up products. ☆57 Updated 9 years ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆180 Updated last year
- a collection of resources and blogs about Apache Superset ☆88 Updated 3 years ago
- Apache Airflow Guide ☆30 Updated last year
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago