argoproj / data-pipeline
☆36 Updated 7 years ago
Alternatives and similar repositories for data-pipeline:
Users who are interested in data-pipeline are comparing it to the libraries listed below.
- Deploy your Spark Production Cluster on Kubernetes ☆47 Updated 4 years ago
- Highly configurable Helm Presto Chart ☆24 Updated 5 years ago
- Airflow on Kubernetes Operator ☆89 Updated 2 years ago
- ☆34 Updated 3 months ago
- ARCHIVED: Run Debezium/KafkaConnect CDC components in Kubernetes ☆24 Updated 6 years ago
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated 2 months ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆97 Updated 2 years ago
- Setup Apache Airflow on Kubernetes ☆10 Updated 6 years ago
- Ansible role to install Apache Airflow ☆81 Updated last month
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆82 Updated 5 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also supports multiple… ☆26 Updated 3 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster ☆39 Updated 4 years ago
- 🚚 ETL for Spark and Airflow ☆25 Updated 7 years ago
- Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub ☆37 Updated 7 years ago
- ☆37 Updated 5 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Kubernetes custom controller and CRDs to manage Airflow ☆299 Updated 4 years ago
- A tutorial on how to get started with Presto. ☆56 Updated 3 years ago
- [ARCHIVED] Moved to github.com/NVIDIA/spark-xgboost-examples ☆70 Updated 4 years ago
- ☆14 Updated last month
- A K8s-based infrastructure for analytics ☆24 Updated 5 years ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆36 Updated 3 months ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated last month
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 Updated last year
- A Github API client to extract events and actions, and load into a database ☆28 Updated 3 years ago
- Apache Airflow CI pipeline ☆19 Updated 5 years ago
- Package to extend Airflow functionality with CWL v1.0 support ☆12 Updated 5 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆197 Updated 3 months ago