provectus / streaming-data-platform
☆24 Updated 3 years ago
Alternatives and similar repositories for streaming-data-platform
Users who are interested in streaming-data-platform are comparing it to the repositories listed below
- Reference Dockerfiles for production usage ☆24 Updated 5 years ago
- Swiss Army Kube (SAK) is an open-source IaC (Infrastructure as Code) collection of services for quick, easy, and controllable deployment … ☆150 Updated last month
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 Updated 3 years ago
- Airflow declarative DAGs via YAML ☆133 Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆90 Updated last year
- ITSumma Spark Greenplum Connector ☆41 Updated last year
- Setup for running Trino with Hive Metastore on Kubernetes ☆103 Updated 3 years ago
- Aiven's collection of Single Message Transformations (SMTs) for Apache Kafka Connect ☆84 Updated this week
- Oozie Workflow to Airflow DAGs migration tool ☆88 Updated 7 months ago
- A tool to create Airflow RBAC roles with DAG-level permissions from the CLI. ☆13 Updated 2 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆70 Updated last month
- ☆79 Updated last year
- An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR ☆175 Updated 4 months ago
- Schema Registry ☆17 Updated last year
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Ambari stack service for installing and managing Apache Airflow on HDP cluster ☆59 Updated 6 years ago
- Grafana dashboards and StatsD exporter config for Airflow monitoring ☆286 Updated last year
- Common Transforms for Kafka Connect. ☆167 Updated 2 months ago
- Trino connectors for accessing APIs with an OpenAPI spec ☆38 Updated last week
- Data Quality Gate based on AWS ☆57 Updated last year
- 🚀 Deploy Kubeflow on AWS EKS with Terraform 🤖 ☆65 Updated 2 years ago
- Multi-node Presto cluster in Docker containers ☆126 Updated 3 years ago
- Spark on Kubernetes infrastructure Docker images repo ☆38 Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆96 Updated 3 weeks ago
- Stream Discovery and Stream Orchestration ☆123 Updated last month
- Experiments and demonstrations of Avro and Protobuf serialisation ☆61 Updated 2 years ago
- Reference architecture for real-time stream processing with Apache Flink on Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service. ☆72 Updated last year
- Repository of Helm charts for deploying DataHub on a Kubernetes cluster ☆194 Updated this week
- ☆40 Updated 2 years ago
- Spark ETL example processing New York taxi rides public dataset on EKS ☆44 Updated 2 years ago