GoogleCloudPlatform / oozie-to-airflow
Oozie Workflow to Airflow DAGs migration tool
☆87Updated 3 months ago
Alternatives and similar repositories for oozie-to-airflow
Users who are interested in oozie-to-airflow compare it to the libraries listed below
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆88Updated last year
- Snowflake Data Source for Apache Spark.☆226Updated this week
- Task Metrics Explorer☆13Updated 6 years ago
- ☆14Updated 3 months ago
- How to manage Slowly Changing Dimensions with Apache Hive☆55Updated 5 years ago
- Magic to help Spark pipelines upgrade☆35Updated 8 months ago
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a…☆222Updated 2 months ago
- Cask Hydrator Plugins Repository☆68Updated 3 weeks ago
- Astronomer Core Docker Images☆107Updated last year
- The Internals of Spark on Kubernetes☆71Updated 3 years ago
- Data ingestion library for Amundsen to build graph and search index☆205Updated last year
- DataQuality for BigData☆144Updated last year
- ☆80Updated last month
- A tool to validate data, built around Apache Spark.☆101Updated 3 weeks ago
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog☆35Updated last year
- type-class based data cleansing library for Apache Spark SQL☆78Updated 5 years ago
- Data validation library for PySpark 3.0.0☆33Updated 2 years ago
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer☆148Updated this week
- Pylint plugin for static code analysis on Airflow code☆95Updated 4 years ago
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆89Updated 3 weeks ago
- Rocksdb state storage implementation for Structured Streaming.☆17Updated 4 years ago
- PySpark data-pipeline testing and CICD☆28Updated 4 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark☆59Updated last year
- Kinesis Connector for Structured Streaming☆136Updated 11 months ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive☆185Updated 2 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow.☆69Updated last year
- Multiple node presto cluster on docker container☆124Updated 2 years ago
- ☆199Updated last year
- Metadata service library for Amundsen☆83Updated 2 weeks ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html☆61Updated 2 years ago