This project provides end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The Spark clusters are set up within a Docker container on Azure.
☆12Oct 11, 2023Updated 2 years ago
Alternatives and similar repositories for Japan-visa-data-engineering
Users that are interested in Japan-visa-data-engineering are comparing it to the libraries listed below
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD…☆16Sep 19, 2023Updated 2 years ago
- An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Az…☆32Oct 2, 2023Updated 2 years ago
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en…☆25Jan 26, 2024Updated 2 years ago
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆15Dec 27, 2023Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆43Jan 4, 2024Updated 2 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆11Nov 18, 2023Updated 2 years ago
- This project shows how to capture changes from a Postgres database and stream them into Kafka☆41May 17, 2024Updated last year
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆206Oct 23, 2023Updated 2 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆45Dec 11, 2023Updated 2 years ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, and Data Build Tool (DBT), using Azure as our …☆39Dec 18, 2023Updated 2 years ago
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆316Feb 14, 2025Updated last year
- ☆13Apr 18, 2024Updated last year
- ☆16Feb 20, 2026Updated last week
- Repository to host microservice implementation patterns.☆13Jun 25, 2025Updated 8 months ago
- This project aims to build a traveling recommendation application using Google Places API and OpenAI LLM.☆11Mar 19, 2024Updated last year
- Solved data engineering exercises using PySpark☆15Aug 2, 2021Updated 4 years ago
- A data pipeline for processing football data using Python and SQL☆13Sep 12, 2023Updated 2 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆48Mar 14, 2024Updated last year
- Classwork projects and homework completed for the Udacity Data Engineering Nanodegree☆10Jun 6, 2021Updated 4 years ago
- The project focuses on the drowsiness of IT employees, drivers, pilots, crane operators, students, etc. These people need a system which ca…☆14Sep 13, 2018Updated 7 years ago
- The repository includes detailed steps to get data from GES DISC, convert HDF5 files to CSV and plotting geographic data.☆11Aug 17, 2020Updated 5 years ago
- Acquiring and processing information on the world's largest banks☆17Jun 17, 2025Updated 8 months ago
- Sample Spring Boot project implementing a REST CRUD application☆16Oct 19, 2021Updated 4 years ago
- ☆10Jan 18, 2024Updated 2 years ago
- This is an example of using MongoDB as both a source and sink.☆10May 21, 2020Updated 5 years ago
- ☆11Nov 9, 2022Updated 3 years ago
- This project leverages Hadoop, Spark, SQL, and Hive for efficient data integration, transformation, warehousing, and analytics. It provid…☆21Sep 30, 2023Updated 2 years ago
- Project on belief embedding☆20Jun 4, 2025Updated 9 months ago
- ☆11Aug 11, 2022Updated 3 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆48Dec 4, 2023Updated 2 years ago
- ☆10Jan 8, 2024Updated 2 years ago
- Google Advanced Data Analytics Coursera☆12Jul 2, 2023Updated 2 years ago
- ☆14May 14, 2024Updated last year
- This data project can be used as a take-home assignment to learn PySpark and data engineering.☆17Feb 19, 2023Updated 3 years ago
- Transparent sandbox for integration testing against AWS services. Test your infrastructure without changes to your Terraform files or you…☆12Oct 26, 2023Updated 2 years ago
- ETL using Python in Jupyter Notebook, loading CSV, cleaning data, and saving to SQL Database.☆14Nov 17, 2020Updated 5 years ago
- Automatically backing up your Postgres database using NodeJS☆13Nov 14, 2020Updated 5 years ago
- My leetcode solutions☆11Jan 11, 2023Updated 3 years ago
- GPT-4o Powered Calorie Detector☆18May 29, 2024Updated last year