ryandawsonuk / data-platforms-tools
Guide to data platforms and tools
☆32 Updated 3 years ago
Alternatives and similar repositories for data-platforms-tools
Users who are interested in data-platforms-tools are comparing it to the repositories listed below
- Distributed Data Mesh 2.0 | DataMesh-as-a-Code on Cloud | Theory to Industrialization ☆38 Updated 2 years ago
- Yet Another (Spark) ETL Framework ☆21 Updated last year
- Hadoop/Hive/Spark container to perform CI tests ☆11 Updated 4 years ago
- ☆42 Updated 4 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- Full stack data engineering tools and infrastructure set-up ☆52 Updated 4 years ago
- Utility functions for dbt projects running on Spark ☆34 Updated 3 months ago
- Recipes of publicly-available Jupyter images ☆8 Updated 2 months ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated 2 years ago
- ☆13 Updated last year
- A Table format agnostic data sharing framework ☆38 Updated last year
- ☆34 Updated last week
- FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more... ☆21 Updated this week
- Sample configuration to deploy a modern data platform. ☆88 Updated 3 years ago
- A curated list of awesome Databricks resources, including Spark ☆18 Updated 10 months ago
- Data Mesh Architecture ☆78 Updated 10 months ago
- Intended for internal use: deploys all infrastructure required for Astronomer to run on GCP ☆10 Updated last week
- A kind data platform on your local machine. 🤗 ☆10 Updated 2 weeks ago
- ☆11 Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ☆27 Updated 8 months ago
- Faker for Snowflake! ☆33 Updated 2 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- AWS Quick Start Team ☆18 Updated 7 months ago
- Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce. ☆38 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 Updated 2 weeks ago
- Big Data Demystified meetup and blog examples ☆31 Updated 9 months ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆41 Updated 3 weeks ago
- Pipeline library for StreamSets Data Collector and Transformer ☆33 Updated 2 years ago
- Example project using DBT, Databricks and AdventureWorks sample database ☆11 Updated 2 years ago