aws-quickstart / quickstart-datalake-cognizant-talend
☆10 Updated this week
Related projects:
- ☆16 Updated this week
- A curated list of awesome Databricks resources, including Spark ☆14 Updated 2 months ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆18 Updated 4 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆15 Updated 7 months ago
- M3D Engine is a Spark application for the development of scalable data transformations and ingestions in data lakes. ☆18 Updated 3 years ago
- Profiles the data, validates the schema, runs data quality checks, and produces a report ☆20 Updated 5 years ago
- ☆31 Updated this week
- ☆34 Updated last year
- A bunch of hacks developed around dbt ☆48 Updated 4 years ago
- Spark package for checking data quality ☆25 Updated last year
- Automation Framework for loading data in Snowflake into your Raw Vault ☆23 Updated 2 years ago
- Supplementary material for Building a Modern Data Platform with Snowflake, from Pearson. ☆21 Updated 2 years ago
- Examples for High Performance Spark ☆15 Updated 3 weeks ago
- Replication utility for AWS Glue Data Catalog ☆73 Updated last month
- ☆20 Updated 3 years ago
- ☆23 Updated 11 months ago
- AWS Quick Start Team ☆18 Updated 10 months ago
- ☆26 Updated 4 years ago
- ☆19 Updated this week
- Different ways to connect to storage in Azure Databricks ☆10 Updated 5 years ago
- Rules based grant management for Snowflake ☆40 Updated 5 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated last year
- ☆12 Updated this week
- ☆32 Updated 3 months ago
- DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, aud… ☆25 Updated this week
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 Updated last year
- Cloudformation and SQL scripts used to replicate a POC environment from the "Data Lake to Data Warehouse: Enhancing Customer 360 with Ama… ☆30 Updated 4 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated last year
- Building Json data pipeline within Snowflake using Streams and Tasks ☆26 Updated 4 years ago
- Reference Architectures for Datalakes on AWS ☆76 Updated 4 years ago