jaumpedro214 / traffic-flow-spark-kafka
Testing Spark Structured Streaming and Kafka with real data from traffic sensors
☆15Updated 2 years ago
Related projects
Alternatives and complementary repositories for traffic-flow-spark-kafka
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc…☆21Updated 2 years ago
- Processing TfL data for bike usage with Google Cloud Platform.☆42Updated 2 years ago
- ☆36Updated last year
- Simple ETL pipeline using Python☆21Updated last year
- Delta-Lake, ETL, Spark, Airflow☆44Updated 2 years ago
- Data pipeline that scrapes Rust cheater Steam profiles☆51Updated 2 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆37Updated last year
- Kafka variant of the MLOps Level 1 stack☆21Updated 2 years ago
- ☆86Updated 2 years ago
- ☆38Updated 4 months ago
- Project for "Data pipeline design patterns" blog.☆41Updated 3 months ago
- ☆37Updated 4 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE☆26Updated 3 years ago
- ☆32Updated last year
- RedditR for Content Engagement and Recommendation☆21Updated 6 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d…☆24Updated 2 years ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, Data Build Tool (DBT) using Azure as our …☆23Updated 11 months ago
- Data Engineering examples for Airflow, Prefect, and Mage.ai; dbt for BigQuery, Redshift, ClickHouse, PostgreSQL; Spark/PySpark for Batch …☆51Updated last week
- End to end data engineering project with kafka, airflow, spark, postgres and docker.☆67Updated 3 months ago
- Simple stream processing pipeline☆92Updated 5 months ago
- An End-to-End ETL data pipeline that leverages pyspark parallel processing to process about 25 million rows of data coming from a SaaS ap…☆26Updated last year
- A Series of Notebooks on how to start with Kafka and Python☆153Updated last year
- Series follows learning from Apache Spark (PySpark) with quick tips and workaround for daily problems in hand☆42Updated last year
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆16Updated 5 years ago
- End to end data engineering project☆51Updated 2 years ago
- Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog☆11Updated last year
- ☆11Updated 3 years ago
- This project aims to build a streaming application to perform real-time analytics of Covid-19 related tweets and deploy an ML model for r…☆12Updated 3 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D…☆29Updated 4 years ago
- Unit testing using databricks connect☆30Updated 3 years ago