lowks / Airflow
AirFlow is a system to programmatically author, schedule and monitor data pipelines.
☆13Updated 10 years ago
Alternatives and similar repositories for Airflow
Users who are interested in Airflow are comparing it to the libraries listed below.
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics.☆110Updated 2 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator☆73Updated 5 years ago
- A plugin for Apache Airflow that allows you to manage the users that can log in☆14Updated 5 years ago
- Hadoop Cluster Configurations☆32Updated 3 years ago
- A collection of Hive UDFs☆75Updated 5 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆41Updated 7 years ago
- A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability☆234Updated 2 years ago
- Simple Python logging handler for forwarding logs to a Kafka server☆30Updated 5 years ago
- SQL data model for working with Snowplow web data. Supports Redshift and Looker. Snowflake and BigQuery coming soon☆60Updated 4 years ago
- Airflow script for incremental data import from MySQL to Hive using Sqoop.☆18Updated 6 years ago
- An extension of the kafka-python package that adds features like multiprocess consumers.☆39Updated last year
- Hive UDFs for funnel analysis☆83Updated 2 years ago
- Airflow workflow management platform chef cookbook.☆71Updated 5 years ago
- A Getting Started Guide for developing and using Airflow Plugins☆93Updated 6 years ago
- docker for apache-atlas embedded-cassandra-solr☆23Updated 5 years ago
- Sample Airflow DAGs☆62Updated 2 years ago
- Provides a Pythonic interface for reading and writing Avro schemas☆27Updated 2 years ago
- Python DB-API client for Presto☆238Updated last year
- Exports hadoop metrics via HTTP for Prometheus consumption☆19Updated 4 years ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.☆28Updated 7 years ago
- This project is a unified ETL platform that supports various data processing technologies, including Spark, Hive, Hadoop, Python, Linux Sh…☆17Updated 9 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery☆100Updated 4 years ago
- A tutorial on how to get started with Presto.☆56Updated 3 years ago
- Druid service descriptor and parcel for Cloudera CDH5☆32Updated 5 years ago
- Running Presto on k8s☆38Updated 5 years ago
- A schema store service that tracks and manages all the schemas used in the Data Pipeline☆87Updated 4 years ago
- Legoo: A collection of automation modules to build analytics infrastructure☆20Updated 4 years ago
- Python client for Hadoop® YARN API☆109Updated 2 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR☆174Updated last year
- Cask Hydrator Plugins Repository☆68Updated last week