An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow, and saves it to Azure Data Lake. Further processing takes place in Azure Data Factory, Azure Synapse, and Tableau.
☆30Oct 2, 2023Updated 2 years ago
Alternatives and similar repositories for FootballDataEngineering
Users that are interested in FootballDataEngineering are comparing it to the libraries listed below.
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆14Dec 27, 2023Updated 2 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆12Nov 18, 2023Updated 2 years ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, and Data Build Tool (DBT), using Azure as our …☆39Dec 18, 2023Updated 2 years ago
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD…☆16Sep 19, 2023Updated 2 years ago
- ☆15Feb 14, 2025Updated last year
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆329Feb 14, 2025Updated last year
- This project shows how to capture changes from a PostgreSQL database and stream them into Kafka☆42May 17, 2024Updated 2 years ago
- A data pipeline for processing football data using Python and SQL☆13Sep 12, 2023Updated 2 years ago
- A few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake developme…☆12Feb 26, 2020Updated 6 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆51Dec 4, 2023Updated 2 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆48Dec 11, 2023Updated 2 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆48Mar 14, 2024Updated 2 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Sep 26, 2023Updated 2 years ago
- ☆46Jul 6, 2024Updated last year
- ms-dataverse is a Python module for Microsoft Dataverse, offering a lightweight ORM to query, create, update, and delete entities. Utiliz…☆13Apr 10, 2023Updated 3 years ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆215Oct 23, 2023Updated 2 years ago
- The official documentation of the City of Boston's Analytics Team.☆13Jan 21, 2025Updated last year
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to Elasticsearch and MinIO☆65Jul 21, 2023Updated 2 years ago
- ☆18Feb 1, 2025Updated last year
- ☆12Aug 8, 2023Updated 2 years ago
- ☆27Nov 26, 2025Updated 6 months ago
- ☆17Mar 10, 2025Updated last year
- Data pipeline from device to cloud☆11May 14, 2022Updated 4 years ago
- This is a professional training designed by Google with 5 courses. This program also prepares one for the CompTIA A+ exams, the industry …☆13Nov 26, 2022Updated 3 years ago
- This is an end to end MLOps system☆34Nov 27, 2025Updated 6 months ago
- This repository showcases a collection of machine learning projects in various domains, demonstrating my skills and expertise as a data s…☆12Nov 20, 2023Updated 2 years ago
- Toolset for detecting reflected XSS in websites☆16Oct 6, 2018Updated 7 years ago
- This project demonstrates real-time data streaming and processing architecture using Kafka, Spark Streaming, and Debezium for capturing C…☆14Oct 24, 2024Updated last year
- Transform data from on-premises SQL Server to Azure Delta Lake Storage for Analytics and Visualization☆26Jul 16, 2023Updated 2 years ago
- Scraper and analyzer of shared scooter data☆11Jul 30, 2024Updated last year
- ☆24Dec 31, 2024Updated last year
- Data Structures and Algorithms☆22May 24, 2026Updated 2 weeks ago
- Python wrapper for Goodreads API☆30Feb 20, 2020Updated 6 years ago
- ☆28Jul 9, 2025Updated 11 months ago
- Functional Data Engineering tutorial in Python & Airflow.☆17Mar 24, 2023Updated 3 years ago
- Collection of modules to build distributed and reliable concurrent systems in Python.☆207Sep 14, 2013Updated 12 years ago
- Local SQL Database ---> Azure ---> Power BI☆15Oct 13, 2023Updated 2 years ago
- ☆16Aug 5, 2023Updated 2 years ago
- With everything I learned from DEZoomcamp from datatalks.club, this project performs a batch processing on AWS for the cycling dataset wh…☆15Jan 4, 2026Updated 5 months ago