provectus / streaming-data-platform
☆24 · Updated 2 years ago
Alternatives and similar repositories for streaming-data-platform:
Users who are interested in streaming-data-platform are comparing it to the repositories listed below.
- Reference Dockerfiles for production usage ☆24 · Updated 5 years ago
- Swiss Army Kube (SAK) is an open-source IaC (Infrastructure as Code) collection of services for quick, easy, and controllable deployment … ☆149 · Updated last year
- Data Quality Gate based on AWS ☆57 · Updated 8 months ago
- 🚀 Deploy Kubeflow on AWS EKS with Terraform 🤖 ☆64 · Updated 2 years ago
- Airflow declarative DAGs via YAML ☆132 · Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 · Updated last month
- Minikube for big data with Scala and Spark ☆15 · Updated 5 years ago
- ITSumma Spark Greenplum Connector ☆36 · Updated last year
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆66 · Updated 3 years ago
- Data Engineering Digest ☆27 · Updated 9 months ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 · Updated 3 years ago
- Deploy Presto on the cloud easily, using Terraform and Packer ☆44 · Updated 2 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 · Updated 2 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics ☆64 · Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 · Updated last week
- ☆18 · Updated 3 years ago
- MLOps Platform ☆270 · Updated 4 months ago
- This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS … ☆19 · Updated 3 years ago
- JSON schema parser for Apache Spark ☆81 · Updated 2 years ago
- ☆53 · Updated 7 months ago
- Easy CPU Profiling for Apache Spark applications ☆45 · Updated 4 years ago
- Stores Snowplow enriched events in Redshift, Snowflake and Databricks ☆31 · Updated 2 months ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 · Updated 2 years ago
- Repository of helm charts for deploying DataHub on a Kubernetes cluster ☆178 · Updated this week
- A tool to create Airflow RBAC roles with dag-level permissions from cli. ☆13 · Updated last year
- Amundsen Gremlin ☆21 · Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- KSQL Syntax Highlighting for VSCode ☆17 · Updated 2 years ago
- Spark stream from kafka(json) to s3(parquet) ☆15 · Updated 6 years ago
- Example Flink and Kafka integration project ☆15 · Updated 9 years ago