SvenskaSpel / cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
☆16 Updated 3 years ago
Alternatives and similar repositories for cobra-policytool:
Users who are interested in cobra-policytool are comparing it to the libraries listed below.
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- A K8s-based infrastructure for analytics ☆24 Updated 5 years ago
- A bridge to Apache Atlas for provenance metadata created in the course of using Apache NiFi ☆15 Updated 2 years ago
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 Updated 8 years ago
- An Apache Spark app for moving data between Apache Hive and Apache Phoenix/HBase ☆14 Updated 8 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 Updated 4 months ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 Updated 10 months ago
- HDF masterclass materials ☆28 Updated 8 years ago
- This repository is to help with the Partner Demonstration of the Apache Atlas project. ☆30 Updated 9 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆49 Updated last year
- Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages dat… ☆16 Updated 3 years ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆35 Updated last month
- A small project to show how to add lineage to Atlas when using Spark as an ETL tool ☆12 Updated 8 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 Updated 4 years ago
- Hadoop Data Pipeline using Falcon ☆15 Updated 8 years ago
- An opinionated auto-deployer for the Hortonworks Platform ☆34 Updated 3 years ago
- This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server ☆49 Updated last year
- Real-time anomaly detection using Kafka, KSQL User Defined Function and a pre-trained model ☆30 Updated last year
- Spark to Tableau Extractor library ☆18 Updated 7 years ago
- Chef cookbook for the Airflow workflow management platform. ☆69 Updated 5 years ago
- A Spark metrics sink that pushes to InfluxDb ☆51 Updated 4 years ago
- JSON schema parser for Apache Spark ☆81 Updated 2 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- Ambari and Cloudera Manager in Docker ☆22 Updated 5 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 Updated last year
- File compaction tool that runs on top of the Spark framework. ☆59 Updated 5 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago