HuemulSolutions / huemul-bigdatagovernance
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It supports a corporate single-source-of-truth data strategy based on Data Governance best practices. It lets you implement tables with Primary Key and Foreign Key checks when inserting and updating data through the library, null validation, l …
☆11 Updated 2 years ago
Alternatives and similar repositories for huemul-bigdatagovernance
Users that are interested in huemul-bigdatagovernance are comparing it to the libraries listed below
- ☆14 Updated 8 months ago
- ☆18 Updated 6 years ago
- Data Lineage Tracking And Visualization Solution ☆635 Updated this week
- ☆40 Updated 4 years ago
- Food for thoughts around data contracts ☆27 Updated last week
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them ☆70 Updated 2 months ago
- Create HTML profiling reports from Apache Spark DataFrames ☆196 Updated 5 years ago
- New Generation Opensource Data Stack Demo ☆438 Updated 2 years ago
- Python API for Deequ ☆788 Updated 3 months ago
- Spark fires is an anti-pattern playground where we deliberately break Spark applications in various ways so you can observe what happens a… ☆42 Updated 8 months ago
- Databricks Implementation of the TPC-DI Specification using Traditional Notebooks and/or Delta Live Tables ☆87 Updated 2 weeks ago
- MLflow samples - deprecated ☆22 Updated 2 years ago
- Generate and Visualize Data Lineage from query history ☆326 Updated last year
- Tool to automate data quality checks on data pipelines ☆255 Updated 2 years ago
- PySpark test helper methods with beautiful error messages ☆706 Updated 2 weeks ago
- An open protocol for secure data sharing ☆851 Updated last month
- Delta Lake helper methods in PySpark ☆324 Updated 10 months ago
- A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆185 Updated 2 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆256 Updated last week
- A Microsoft Power BI Custom Connector allowing you to import Trino data into Power BI. ☆73 Updated 6 months ago
- A dbt adapter for oracle db backend ☆36 Updated 3 years ago
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆440 Updated last week
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 Updated last year
- Delta Lake examples ☆227 Updated 9 months ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆675 Updated 4 months ago
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB and Superset ☆235 Updated 5 months ago
- This repo shows, via a real-world use case, how to launch dbt models from a DAG in Apache Airflow. ☆12 Updated 3 months ago
- Testing framework for Databricks notebooks ☆305 Updated last year
- Python API for Deequ ☆40 Updated 4 years ago
- Examples of metadata driven SQL processes implemented in Databricks ☆16 Updated 4 years ago