airscholar / e2e-data-engineering

An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache ZooKeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
223 · Updated last year

Alternatives and similar repositories for e2e-data-engineering:

Users who are interested in e2e-data-engineering are comparing it to the repositories listed below.