GoogleCloudPlatform / dataproc-scala-examples
Dataproc Scala Examples is an effort to assist in the creation of Spark jobs written in Scala to run on Dataproc.
☆12 Updated last year
Alternatives and similar repositories for dataproc-scala-examples
Users who are interested in dataproc-scala-examples are comparing it to the repositories listed below.
- Course notes for the Astronomer Certification DAG Authoring for Apache Airflow ☆56 Updated last year
- The resources of the preparation course for Databricks Data Engineer Associate certification exam ☆551 Updated last week
- The resources of the preparation course for Databricks Data Engineer Professional certification exam ☆160 Updated last week
- A self-contained dbt project for testing purposes ☆513 Updated last year
- Supplementary Materials for The Complete dbt (Data Build Tool) Bootcamp Udemy course ☆724 Updated last week
- Docker with Airflow and Spark standalone cluster ☆262 Updated 2 years ago
- ☆21 Updated 2 years ago
- Code snippets for Data Engineering Design Patterns book ☆296 Updated last week
- Code for "Efficient Data Processing in Spark" Course ☆350 Updated 2 months ago
- Practice your Pyspark skills! ☆97 Updated 4 years ago
- This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs. ☆125 Updated last year
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow ☆34 Updated 5 years ago
- Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more ☆370 Updated 2 years ago
- Study Notes for the Snowflake SnowPro Core Certification Exam ☆105 Updated 6 months ago
- Sample project to demonstrate data engineering best practices ☆204 Updated last year
- Project utilising data from the Age of Empires api at 'https://aoestats.io' ☆54 Updated last year
- Template for a data contract used in a data mesh. ☆486 Updated last year
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆182 Updated 2 years ago
- Dataproc templates and pipelines for solving in-cloud data tasks ☆140 Updated last week
- Demo DAGs that show how to run dbt Core in Airflow using Cosmos ☆65 Updated 7 months ago
- This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering so… ☆31 Updated 2 years ago
- Run your dbt Core or dbt Fusion projects as Apache Airflow DAGs and Task Groups with a few lines of code ☆1,106 Updated this week
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆281 Updated last year
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboa… ☆243 Updated 2 years ago
- This repository goes over how to handle massive variety in data engineering ☆308 Updated 2 years ago
- Apartments Data Pipeline using Airflow and Spark. ☆23 Updated 3 years ago
- Code for Data Pipelines with Apache Airflow ☆811 Updated last year
- Fivetran provider for Airflow ☆29 Updated 2 years ago
- 🥪🦘 An open source sandbox project exploring dbt workflows via a fictional sandwich shop's data. ☆238 Updated last week
- Main repository to collect notes and scripts written during DataExpert.IO January 2025 bootcamp to help anyone interested. ☆34 Updated 8 months ago