UrbanOS-Public / kdp
Kubernetes deployment of PrestoDB, Hive Metastore, and Minio S3-standard object store
☆17 Updated 2 years ago
Alternatives and similar repositories for kdp:
Users who are interested in kdp are comparing it to the repositories listed below:
- Kubernetes (K8s) Operator for PrestoDB ☆46 Updated 3 years ago
- Helm Chart for lyft/flinkk8soperator ☆11 Updated 4 years ago
- Presto K8S Operator ☆9 Updated 4 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- An Integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆82 Updated 4 years ago
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 Updated 11 months ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- A Kubernetes CRD and controller to manage Flink jobs running on any Flink Job Manager ☆8 Updated 2 months ago
- ☆40 Updated last year
- Setup for running Trino with Hive Metastore on Kubernetes ☆99 Updated 2 years ago
- Rocksdb state storage implementation for Structured Streaming. ☆17 Updated 4 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆97 Updated 2 years ago
- Airflow on Kubernetes Operator ☆89 Updated 2 years ago
- ☆39 Updated 5 years ago
- Data Sketches for Apache Spark ☆22 Updated 2 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago
- The Internals of Spark on Kubernetes ☆70 Updated 2 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- HDFS Automatic Snapshot Service for Linux ☆12 Updated 8 years ago
- Kubernetes Helm Chart to deploy Apache Atlas ☆15 Updated 4 years ago
- ☆37 Updated 5 years ago
- Spark to Tableau Extractor library ☆18 Updated 7 years ago
- Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol ☆34 Updated 2 years ago
- Performance optimization for Spark running on Kubernetes ☆86 Updated 4 years ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 Updated 2 years ago
- Ansible playbooks to construct distributed computing environments ☆62 Updated 3 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated 10 months ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 11 months ago