KennethanCeyer / awesome-data-pipeline
Awesome list for data pipelines
☆35 · Updated 2 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users who are interested in awesome-data-pipeline are comparing it to the repositories listed below.
- 📒(GitBook) A curated list of awesome Data Engineering resources ☆36 · Updated this week
- A curated list of awesome DataOps tools ☆196 · Updated 3 weeks ago
- ☆55 · Updated last week
- New generation opensource data stack ☆70 · Updated 3 years ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆178 · Updated 11 months ago
- A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for … ☆137 · Updated 5 years ago
- Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team … ☆124 · Updated last week
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆59 · Updated last year
- Full stack data engineering tools and infrastructure set-up ☆55 · Updated 4 years ago
- Apache Spark Guide ☆31 · Updated 3 years ago
- Awesome list of dataops products, open source and resources ☆24 · Updated 3 years ago
- A curated list of dagster code snippets for data engineers ☆56 · Updated last year
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆86 · Updated 2 years ago
- A curated list of open source tools used in analytics platforms and data engineering ecosystem ☆363 · Updated 4 months ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, PostgreSQL and Superset ☆45 · Updated 9 months ago
- How to build an awesome data engineering team ☆100 · Updated 5 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆180 · Updated 3 years ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆28 · Updated 2 years ago
- A collection of resources and blogs about Apache Superset ☆86 · Updated 3 years ago
- Open Source Data Quality Monitoring. ☆158 · Updated 2 weeks ago
- Data Tools Subjective List ☆86 · Updated last year
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres + Debezium CDC, MySQL,… ☆29 · Updated 2 months ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆78 · Updated 2 years ago
- This is a repo with links to everything you'd ever want to learn about data engineering ☆11 · Updated 8 months ago
- Spark data pipeline that processes movie ratings data. ☆29 · Updated last week
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ☆92 · Updated last month
- ☆23 · Updated 4 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 · Updated 3 years ago
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆218 · Updated 3 months ago
- Docker Airflow - Contains a docker compose file for Airflow 2.0 ☆68 · Updated 2 years ago