ryandawsonuk / data-platforms-tools
Guide to data platforms and tools
☆32 Updated 3 years ago
Alternatives and similar repositories for data-platforms-tools:
Users who are interested in data-platforms-tools are comparing it to the repositories listed below.
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 4 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆11 Updated 4 years ago
- A curated list of awesome Databricks resources, including Spark ☆17 Updated 9 months ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Example project using DBT, Databricks and AdventureWorks sample database ☆11 Updated 2 years ago
- CI/CD pipeline that deploys a dbt image on a GKE cluster ☆11 Updated 3 years ago
- Delta Lake Documentation ☆49 Updated 10 months ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- Full stack data engineering tools and infrastructure set-up ☆51 Updated 4 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆19 Updated 4 years ago
- Supplementary material for Building a Modern Data Platform with Snowflake, from Pearson. ☆21 Updated 3 years ago
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- ☆11 Updated last year
- Distributed Data Mesh 2.0 | DataMesh-as-a-Code on Cloud | Theory to Industrialization ☆37 Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆26 Updated 8 months ago
- Events about the open source data stack ☆13 Updated 3 years ago
- ☆36 Updated 2 years ago
- Data Engineering with Scala, published by Packt ☆23 Updated last year
- This repository contains NiFi processors for interacting with Snowflake Cloud Data Platform. ☆12 Updated 4 months ago
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them ☆49 Updated last week
- This repo contains the LookML for the model and dashboards used with the FHIR healthcare dataset to showcase how Looker can add value to … ☆11 Updated 2 years ago
- Cassandra + Spark = ❤️ Machine Learning with Apache Spark & Cassandra ☆20 Updated 3 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 Updated 2 years ago
- NiFi Processor for Apache Pulsar ☆10 Updated 5 months ago
- dbt / Amazon Redshift Demonstration Project ☆34 Updated 2 years ago
- Weekly Data Engineering Newsletter ☆95 Updated 9 months ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Big Data Demystified meetup and blog examples ☆31 Updated 8 months ago
- AWS Quick Start Team ☆18 Updated 6 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated 2 years ago