noi-techpark / big-data-for-tourism
☆12 Updated 2 years ago
Related projects
Alternatives and complementary repositories for big-data-for-tourism
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics. ☆18 Updated 6 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆43 Updated 5 years ago
- Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala. ☆94 Updated 3 years ago
- Four different big datasets joined into a single table for final data analysis. Fraud detection taking into consideration different key feat… ☆44 Updated 4 years ago
- Project - Data Processing and Analysis in Python Course ☆41 Updated 6 years ago
- This repository will help you to learn about databricks concept with the help of examples. It will include all the important topics which… ☆92 Updated 3 months ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on… ☆26 Updated 2 years ago
- Playground for PySpark (RDDs, DStreams) and Apache Airflow. Based on the example of parsing (including incorrectly formatted strings) web … ☆16 Updated 2 years ago
- Data Engineering, Data Warehouse, Data Mart, Cloud Data, AWS, SAS, Redshift, S3 ☆26 Updated 3 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆14 Updated 5 years ago
- PySpark Cheatsheet ☆35 Updated last year
- Big data projects implemented by Maniram yadav ☆50 Updated 6 years ago
- PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like… ☆83 Updated last year
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆16 Updated 5 years ago
- ☆86 Updated 2 years ago
- data-warehouse-snowflake-for-data-engineering ☆14 Updated last year
- Personal project where I perform some analytics (including Sentiment Analysis) over a Twitter Stream using Big Data Technologies of the H… ☆20 Updated last year
- Simple ETL pipeline using Python ☆21 Updated last year
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t… ☆28 Updated 7 months ago
- Step by step instructions to create a production-ready data pipeline ☆27 Updated 2 months ago
- Big Data webapp using Chicago street congestion, crashes, red light violations, and speed camera violations ☆38 Updated 3 years ago
- Deployed a Kafka instance on an AWS EC2 instance to stream data into Databricks ☆10 Updated last year
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆56 Updated 5 months ago
- plan, design and implement enterprise data infrastructure solutions and create the blueprints for an organization’s data management syste… ☆10 Updated last year
- Build a scikit-learn model to predict churn using customer telco data. ☆14 Updated last year
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆99 Updated 3 years ago
- ☆19 Updated last year