An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Factory, Azure Synapse and Tableau.
☆30Oct 2, 2023Updated 2 years ago
Alternatives and similar repositories for FootballDataEngineering
Users that are interested in FootballDataEngineering are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- This project provides an end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The spark cluste…☆11Oct 11, 2023Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆45Jan 4, 2024Updated 2 years ago
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆14Dec 27, 2023Updated 2 years ago
- ☆15Feb 14, 2025Updated last year
- Includes all the Practice Material and Project☆30May 19, 2025Updated last year
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆331Feb 14, 2025Updated last year
- This project shows how to capture changes from postgres database and stream them into kafka☆42May 17, 2024Updated 2 years ago
- Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake developme…☆12Feb 26, 2020Updated 6 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆50Dec 4, 2023Updated 2 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to Apache spark cluster in different programming laguages using Python…☆48Mar 14, 2024Updated 2 years ago
- ☆46Jul 6, 2024Updated last year
- Snowflake - Build and Architect Data Pipelines using AWS, published by Packt☆24Apr 3, 2023Updated 3 years ago
- ms-dataverse is a Python module for Microsoft Dataverse, offering a lightweight ORM to query, create, update, and delete entities. Utiliz…☆13Apr 10, 2023Updated 3 years ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆217Oct 23, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- ☆18Feb 1, 2025Updated last year
- Creates csv files containing football data scraped from the website www.fbref.com☆23Jul 4, 2023Updated 2 years ago
- ☆17Mar 10, 2025Updated last year
- Data pipeline from device to cloud☆11May 14, 2022Updated 4 years ago
- This repository showcases a collection of machine learning projects in various domains, demonstrating my skills and expertise as a data s…☆12Nov 20, 2023Updated 2 years ago
- This project demonstrates real-time data streaming and processing architecture using Kafka, Spark Streaming, and Debezium for capturing C…☆15Oct 24, 2024Updated last year
- Transform data from on-premises SQL Server to Azure Delta Lake Storage for Analytics and Visualization☆26Jul 16, 2023Updated 2 years ago
- Scrapper and analyzer of shared scooter data☆11Jul 30, 2024Updated last year
- ☆24Dec 31, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆34Nov 14, 2024Updated last year
- ☆12Jan 14, 2023Updated 3 years ago
- ☆28Jul 9, 2025Updated 11 months ago
- ☆19May 11, 2023Updated 3 years ago
- Functional Data Engineering tutorial in Python & Airflow.☆17Mar 24, 2023Updated 3 years ago
- collection of modules to build distributed and reliable concurrent systems in Python.☆207Sep 14, 2013Updated 12 years ago
- Local SQL Database ---> Azure ---> Power BI☆15Oct 13, 2023Updated 2 years ago
- ☆16Aug 5, 2023Updated 2 years ago
- With everything I learned from DEZoomcamp from datatalks.club, this project performs a batch processing on AWS for the cycling dataset wh…☆15Jan 4, 2026Updated 5 months ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆23May 13, 2019Updated 7 years ago
- Repository for Data Engineering Interview Series☆40Oct 17, 2024Updated last year
- Add .waitForUrl() to Nightmare☆10Aug 23, 2016Updated 9 years ago
- This formatter which is for handling parameters and file uploaded to Web API controller.☆26Dec 7, 2022Updated 3 years ago
- This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering so…☆36Dec 12, 2023Updated 2 years ago
- Data Engineering Project: Extracting music video metrics of Twice using YouTube API, AWS, and Tableau☆32Nov 21, 2023Updated 2 years ago
- ☆15Aug 3, 2022Updated 3 years ago