DFoly / User_log_pipeline
Creating a Streaming Pipeline for user log data in Google Cloud Platform
☆22 · Updated 5 years ago
Alternatives and similar repositories for User_log_pipeline:
Users who are interested in User_log_pipeline are comparing it to the repositories listed below
- My presentation at ODSC India 2018 about Deep Learning with Apache Spark ☆27 · Updated 6 years ago
- Build end-to-end Machine Learning pipeline to predict accessibility of playgrounds in NYC ☆15 · Updated 4 years ago
- AWS Big Data Certification ☆25 · Updated 3 months ago
- An example PySpark project with pytest ☆16 · Updated 7 years ago
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 · Updated 5 years ago
- Sample Notebooks for PipelineAI ☆44 · Updated 2 years ago
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 · Updated 9 years ago
- A simple introduction to using spark ml pipelines ☆26 · Updated 7 years ago
- Workshop for Spark and Databricks ☆54 · Updated 5 years ago
- pyspark sample scripts ☆17 · Updated 6 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 · Updated 4 months ago
- Repository used for Spark Trainings ☆53 · Updated 2 years ago
- Code examples for the Introduction to Kubeflow course ☆14 · Updated 4 years ago
- Contains code and presentation for my interactive hack session, 'Effective Feature Engineering: A Structured Approach to Building Better … ☆30 · Updated 4 years ago
- Collection of presentation of my work on various platforms and meetups ☆22 · Updated 6 years ago
- ☆16 · Updated 2 years ago
- Example of orchestrating dependent Databricks jobs using Airflow ☆11 · Updated 5 years ago
- Apache Spark Interview Question and Answers ☆20 · Updated 4 years ago
- Mastering Spark for Data Science, published by Packt ☆47 · Updated 2 years ago
- Analyzing NBA data using Spark 2.1 ☆46 · Updated 8 years ago
- Example custom model image trainable and distributable via AWS SageMaker ☆35 · Updated last year
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆24 · Updated last year
- Code for my presentation: Using PySpark to Process Boat Loads of Data ☆20 · Updated 7 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A python job will then be submitted to a Apach… ☆19 · Updated 8 years ago
- ☆20 · Updated 5 years ago
- code, labs and lectures for the course ☆48 · Updated 2 years ago
- Projects from Udacity Data Streaming Nanodegree ☆15 · Updated last year
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆51 · Updated 8 years ago
- Follow the Lumiata Tech Blog on Medium! ☆21 · Updated 2 years ago
- Use Kafka and Apache Spark streaming to perform click stream analytics ☆76 · Updated 5 years ago