newfront/hitchhikers_guide_to_deltalake_streaming

Don't Panic. This guide will help you when it feels like the end of the world.

☆32

Alternatives and similar repositories for hitchhikers_guide_to_deltalake_streaming

Users that are interested in hitchhikers_guide_to_deltalake_streaming are comparing it to the libraries listed below.

bartosz25 / spark-playground
Code snippets used in demos recorded for the blog.
☆42 · Apr 30, 2026 · Updated 2 months ago
bartosz25 / data-ai-summit-2024
Visits sessionization pipeline used for the talk
☆13 · May 28, 2024 · Updated 2 years ago
delta-incubator / delta-lake-definitive-guide
☆64 · Feb 1, 2025 · Updated last year
vincentsarago / conferences
Talks, Meetup and Workshops
☆12 · Jun 4, 2024 · Updated 2 years ago
databrickslabs / pylint-plugin
Databricks Plugin for PyLint
☆33 · Mar 27, 2026 · Updated 4 months ago
Nike-Inc / spark-expectations
A Python Library to support running data quality rules while the spark job is running⚡
☆201 · Jul 14, 2026 · Updated 2 weeks ago
newfront / spark-moderndataengineering
The source code for the book Modern Data Engineering with Apache Spark
☆43 · Jul 26, 2022 · Updated 4 years ago
stikkireddy / dbtunnel-examples
Examples of Using DBTunnel
☆11 · Apr 24, 2024 · Updated 2 years ago
bartosz25 / spark-docker
Repository containing Docker images for Spark master and slave
☆15 · Nov 3, 2019 · Updated 6 years ago
kasun98 / datasystem
End to end data pipeline
☆22 · Apr 13, 2025 · Updated last year
holdenk / spark-flowchart
Flowchart for debugging Spark applications
☆104 · Sep 25, 2024 · Updated last year
CodyAustinDavis / edw-best-practices
Git Repo for EDW Best Practice Assets on the Lakehouse
☆16 · Dec 11, 2023 · Updated 2 years ago
delta-io / delta-docker
Official Dockerfile for Delta Lake
☆64 · Feb 24, 2026 · Updated 5 months ago
aws-samples / spark-streaming-sql-s3-connector
An Apache Spark Structured Streaming S3 connector for reading S3 files using Amazon S3 event notifications to AWS SQS
☆16 · Feb 13, 2024 · Updated 2 years ago
jaceklaskowski / learn-databricks
Notebooks to learn Databricks Lakehouse Platform
☆46 · Updated this week
arrikto / learn-kubeflow
Learn Kubeflow with Arrikto
☆15 · Jan 4, 2022 · Updated 4 years ago
andyweaves / system-tables-audit-logs
SQL Queries & Alerts for Databricks System Tables access.audit Logs
☆50 · Jun 29, 2026 · Updated last month
aws-samples / mwaa-disaster-recovery
Disaster recovery solution for Amazon Managed Workflows for Apache Airflow (MWAA)
☆12 · Apr 27, 2026 · Updated 3 months ago
Nike-Inc / brickflow
Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
☆228 · Jun 29, 2026 · Updated last month
lisancao / lakehouse-at-home
A fully open-source, self-hostable data lakehouse for local development and testing of modern data workflows
☆107 · Jun 1, 2026 · Updated last month
sandonair007 / darknet
Convolutional Neural Networks
☆12 · Oct 5, 2017 · Updated 8 years ago
anshul-musing / single-echelon-inventory-assessment
inventory simulation modules for single-echelon supply chain
☆13 · Dec 25, 2018 · Updated 7 years ago
godatadriven / dbt-data-ai-summit
Code that was used as an example during the Data+AI Summit 2020
☆15 · Mar 8, 2021 · Updated 5 years ago
bufbuild / registry-proto
BSR's new public API. Currently in development.
☆22 · Jul 20, 2026 · Updated last week
rockthejvm / udemy-akka-persistence-starter
The official Rock the JVM Akka Persistence Starter project
☆11 · Apr 4, 2019 · Updated 7 years ago
AstraBert / jake
Make-like task executor for Unix OS
☆16 · Jul 8, 2026 · Updated 3 weeks ago
tiagotxm / yt-spark-no-kubernetes
☆13 · Feb 19, 2025 · Updated last year
quixio / streaming-academy
☆10 · Jul 24, 2024 · Updated 2 years ago
christianroman / df-gtfs
Script to import the "df_gtfs" dataset into PostgreSQL
☆13 · Jun 24, 2013 · Updated 13 years ago
aws-samples / aws-mwaa-openlineage
In this repository, we show how to get started with data lineage on AWS using OpenLineage. This is an AWS Cloud Development Kit project (…
☆13 · Jul 25, 2024 · Updated 2 years ago
rockthejvm / udemy-akka-http
For Udemy students: the official repository for the Rock the JVM Akka HTTP with Scala course
☆14 · Apr 27, 2022 · Updated 4 years ago
microsoft / planetary-computer-tasks
PC Tasks: A framework for processing and ingesting data into the Planetary Computer
☆43 · Jun 28, 2026 · Updated last month
PacktPublishing / Time-Series-Analysis-on-AWS
Time series analysis on AWS, published by Packt
☆17 · Mar 2, 2026 · Updated 4 months ago
wricardo / grpcurl-mcp
Model Context Protocol (MCP) server to interact with gRPC services using the grpcurl tool
☆17 · Mar 5, 2025 · Updated last year
substrait-io / substrait-validator
☆15 · Jul 21, 2026 · Updated last week
yennanliu / utility_shell
Collection of shell/Bash scripts for various using cases | #SE
☆11 · Jul 10, 2026 · Updated 2 weeks ago
souvik-databricks / dlt-with-debug
A lightweight helper utility which allows developers to do interactive pipeline development by having a unified source code for both DLT …
☆50 · Dec 7, 2022 · Updated 3 years ago
AlexMercedCoder / Pangolin
Pangolin is an Open-Source MIT Licensed Data Lakehouse Catalog in RUST with Iceberg REST Catalog Support
☆17 · Jan 2, 2026 · Updated 6 months ago
corriebar / statrethinking_reading_group
Material for the Berlin Bayesian reading group covering Statistical Rethinking by Richard McElreath
☆10 · May 7, 2020 · Updated 6 years ago