ant-laz / streamingworkshop
Step by step development of a streaming pipeline in Python
☆13 Updated 2 years ago
Alternatives and similar repositories for streamingworkshop
Users who are interested in streamingworkshop are comparing it to the libraries listed below.
- ☆185 Updated last week
- BigQuery DataFrames (also known as BigFrames) ☆284 Updated this week
- ☆121 Updated 6 months ago
- An end to end demo of Google's Cloud data and analytic stack. ☆279 Updated this week
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆76 Updated last year
- Repository for Beam College sessions ☆112 Updated 4 years ago
- ☆146 Updated last year
- Source code accompanying: BigQuery: The Definitive Guide by Lakshmanan & Tigani to be published by O'Reilly Media ☆554 Updated last year
- Apache Beam Python examples and templates. ☆14 Updated 3 years ago
- Data Quality Engine for BigQuery ☆278 Updated 8 months ago
- Code for dbt tutorial ☆167 Updated 5 months ago
- Data Engineering on Google Cloud Platform ☆380 Updated last year
- An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer ☆280 Updated 6 months ago
- Code snippets for Data Engineering Design Patterns book ☆331 Updated last month
- ☆72 Updated this week
- Cloud-native, data onboarding architecture for Google Cloud Datasets ☆170 Updated 2 weeks ago
- ☆284 Updated last year
- Data Engineering with Google Cloud Platform, published by Packt ☆120 Updated 2 years ago
- BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables. ☆420 Updated this week
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces. ☆80 Updated 2 years ago
- Cost Efficient Data Pipelines with DuckDB ☆61 Updated 8 months ago
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆93 Updated 2 years ago
- Simple stream processing pipeline ☆110 Updated last year
- Companion repository for the book 'Delta Lake Up and Running' ☆48 Updated 10 months ago
- ☆193 Updated 4 years ago
- Code for "Efficient Data Processing in Spark" Course ☆360 Updated 3 months ago
- Dataproc templates and pipelines for solving in-cloud data tasks ☆148 Updated last week
- ☆130 Updated last year
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆168 Updated 2 years ago
- Fraudfinder: A comprehensive lab series on how to build a real-time fraud detection system on Google Cloud ☆244 Updated last month