hiejulia / Data-pipeline-project
Data pipeline project
☆25Updated last year
Alternatives and similar repositories for Data-pipeline-project:
Users who are interested in Data-pipeline-project are comparing it to the repositories listed below
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆41Updated 5 years ago
- Apache Spark Interview Questions and Answers☆21Updated 4 years ago
- Apache Spark 3 - Structured Streaming Course Material☆121Updated last year
- This repository implements a real-time credit card fraud detection pipeline using Kafka, Spark and Cassandra. Kafka continuously produces…☆16Updated 3 years ago
- This repository hosts the code/projects/demos/slides for Big Data technologies under the Apache Hadoop and Apache Spark umbrella.☆42Updated 2 years ago
- All my Big Data projects☆27Updated 8 years ago
- data engineering 100 days 🤖 🧲 🦾 | #DE☆40Updated last year
- ☆148Updated 6 years ago
- How to build an awesome data engineering team☆99Updated 5 years ago
- Data Engineering on GCP☆30Updated 2 years ago
- The Ultimate Hands-On Hadoop - Tame your Big Data!: https://www.udemy.com/the-ultimate-hands-on-hadoop-tame-your-big-data/☆8Updated 5 years ago
- Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collab…☆36Updated 4 years ago
- PySpark Cheatsheet☆35Updated 2 years ago
- ☆112Updated 4 years ago
- ☆19Updated 5 years ago
- Educational notes and hands-on problems with solutions for the Hadoop ecosystem☆86Updated 5 years ago
- Classwork projects and homework completed for the Udacity Data Engineering Nanodegree☆74Updated last year
- Counting Tweets Per User in Real-Time☆41Updated 7 years ago
- ☆19Updated 6 years ago
- Code examples on Apache Spark using Python☆106Updated 2 years ago
- Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.☆58Updated 2 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE☆26Updated 3 years ago
- Master Big Data With PySpark and AWS☆127Updated last year
- Deployed a Kafka instance on an AWS EC2 instance to stream data into Databricks☆10Updated last year
- ☆61Updated last week
- Because it's never too late to start taking notes and make them public...☆60Updated 2 months ago
- This repo was created mostly for PySpark- and Hive-related interview questions.☆46Updated 2 years ago
- Apache Spark using SQL☆14Updated 3 years ago
- ☆37Updated 5 years ago
- Hadoop tutorial files. For detailed tutorials, visit www.youtube.com/learningjournalin☆26Updated 7 years ago