vincentnam / docker_datalake
Datalake
☆31 Updated last week
Alternatives and similar repositories for docker_datalake
Users that are interested in docker_datalake are comparing it to the libraries listed below
- EverythingApacheNiFi ☆115 Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆48 Updated last year
- Infrastructure for Big Data: Hadoop + NiFi + Spark + Hive using Docker ☆20 Updated 2 years ago
- New generation opensource data stack ☆74 Updated 3 years ago
- Zeppelin docker ☆16 Updated 4 years ago
- Build Data Lake using Open Source tools ☆115 Updated 5 months ago
- Main TDP repository ☆58 Updated last month
- Repository for Docker Image of Apache-Superset. [Docker Image: https://hub.docker.com/r/abhioncbr/docker-superset] ☆104 Updated 4 years ago
- Playground site for creating/validating data contracts ☆10 Updated 2 months ago
- apache-nifi-templates ☆53 Updated 4 years ago
- Deploy multiple Dagster data pipelines on Docker environment ☆23 Updated last year
- Data Engineering Projects using Mage.ai as orchestrator ☆16 Updated 5 months ago
- Repository for building docker image, with open-source applications ☆26 Updated last year
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager) ☆120 Updated 4 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆495 Updated 2 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 3 years ago
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- KNIME Analytics Platform & SDK with Docker Container in X11 desktop ☆25 Updated 3 years ago
- MonitoFi: Health & Performance Monitor for your Apache NiFi ☆66 Updated 2 years ago
- Template for building FastAPI applications with Neo4j(neontology). ☆20 Updated 2 months ago
- This project demonstrates how to build and automate an ETL pipeline using DAGs in Airflow and load the transformed data to Bigquery. Ther… ☆24 Updated 2 months ago
- Collection of assets used for various articles at https://blogs.min.io ☆40 Updated 7 months ago
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆23 Updated 3 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆170 Updated 2 weeks ago
- Data Mesh Manager (Community Edition) ☆48 Updated last week
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres + Debezium CDC, MySQL,… ☆28 Updated 5 months ago
- Dremio Container Tools ☆162 Updated 2 months ago
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines ☆134 Updated 2 years ago
- Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. The Starter Kit showcases the … ☆25 Updated last month
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and … ☆34 Updated last year