provectus / streaming-data-platform
☆24 · Updated 2 years ago
Related projects:
- Reference Dockerfiles for production usage ☆24 · Updated 4 years ago
- Swiss Army Kube (SAK) is an open-source IaC (Infrastructure as Code) collection of services for quick, easy, and controllable deployment … ☆147 · Updated last year
- Data Quality Gate based on AWS ☆55 · Updated 2 months ago
- 🚀 Deploy Kubeflow on AWS EKS with Terraform 🤖 ☆64 · Updated last year
- ITSumma Spark Greenplum Connector ☆34 · Updated 5 months ago
- A tool to create Airflow RBAC roles with dag-level permissions from cli. ☆13 · Updated last year
- Airflow declarative DAGs via YAML ☆131 · Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 · Updated 6 months ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 · Updated 6 months ago
- Data validation library for PySpark 3.0.0 ☆34 · Updated last year
- Spark on Kubernetes infrastructure Docker images repo ☆37 · Updated last year
- Deploy Presto on the cloud easily, using Terraform and Packer ☆44 · Updated last year
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 · Updated 2 years ago
- Data Engineering Digest ☆27 · Updated 2 months ago
- MLOps Platform ☆271 · Updated 2 weeks ago
- JSON schema parser for Apache Spark ☆81 · Updated 2 years ago
- A K8s-based infrastructure for analytics ☆24 · Updated 4 years ago
- ☆77 · Updated last year
- Kafka Connector for Iceberg tables ☆16 · Updated last year
- ☆63 · Updated 4 years ago
- ☆18 · Updated 2 years ago
- ☆16 · Updated last year
- Spark and Hive docker containers sharing a common MySQL metastore ☆26 · Updated 4 years ago
- Terraform / NiFi on the Google Cloud Platform ☆28 · Updated 3 months ago
- ODD Specification is a universal open standard for collecting metadata. ☆128 · Updated 2 months ago
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆65 · Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 · Updated 7 years ago
- Minikube for big data with Scala and Spark ☆15 · Updated 4 years ago
- Code to be contributed to the Apache Airflow (incubating) project for ETL workflow management for integrating with the Snowflake Data War… ☆25 · Updated 7 years ago
- JUnit integration for testing the Apache Hive Metastore and HiveServer2 Thrift APIs ☆24 · Updated 6 months ago