SvenskaSpel / cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
☆16Updated 3 years ago
Alternatives and similar repositories for cobra-policytool:
Users that are interested in cobra-policytool are comparing it to the libraries listed below
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A…☆13Updated 8 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.☆20Updated 3 years ago
- A bridge to Apache Atlas for provenance metadata created in the course of using Apache NiFi☆15Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆18Updated 8 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆61Updated 6 months ago
- ☆14Updated 8 years ago
- Schema Registry integration for Apache Spark☆40Updated 2 years ago
- Extensible streaming ingestion pipeline on top of Apache Spark☆44Updated last year
- A Spark-based data comparison tool at scale that helps software development engineers compare a plethora of pair combinations o…☆50Updated last year
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor…☆66Updated last month
- Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline☆75Updated 2 years ago
- type-class based data cleansing library for Apache Spark SQL☆78Updated 5 years ago
- Scalable CDC Pattern Implemented using PySpark☆18Updated 5 years ago
- Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages dat…☆16Updated 4 years ago
- A small project to show how to add lineage to Atlas when using Spark as ETL tool☆12Updated 8 years ago
- Spark-Radiant is an Apache Spark performance and cost optimizer☆25Updated 2 months ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications☆35Updated 3 months ago
- UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy☆62Updated last year
- Apache Spark ETL Utilities☆40Updated 5 months ago
- Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types and also supports multiple…☆26Updated 3 years ago
- Hadoop Data Pipeline using Falcon☆15Updated 8 years ago
- A Spark datasource for the HadoopOffice library☆38Updated 2 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources☆17Updated 3 years ago
- Sample processing code using Spark 2.1+ and Scala☆51Updated 4 years ago
- ☆10Updated 2 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures.☆72Updated 4 years ago
- Code snippets used in demos recorded for the blog.☆30Updated last month
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator☆73Updated 5 years ago
- Demonstration of a Hive Input Format for Iceberg☆26Updated 4 years ago
- HDF masterclass materials☆28Updated 9 years ago