Create a data pipeline on AWS that runs batch processing on a Spark cluster provisioned by Amazon EMR. The ETL is orchestrated with managed Airflow: it extracts data from S3, transforms the data using Spark, and loads the transformed data back to S3.
☆10 Jul 12, 2021 Updated 4 years ago
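The transform stage described above is commonly submitted to EMR as a step that calls `spark-submit`. The following is a minimal sketch of such a step definition, not the repository's actual code: the function name `build_spark_step`, the step name, the `--input`/`--output` script arguments, and all S3 URIs are hypothetical placeholders.

```python
from typing import Any, Dict


def build_spark_step(script_uri: str, input_uri: str, output_uri: str) -> Dict[str, Any]:
    """Build an EMR step definition that runs a PySpark script with spark-submit.

    All names and URIs passed in are illustrative assumptions; the
    --input/--output flags are arguments the (hypothetical) script would parse.
    """
    return {
        "Name": "transform-with-spark",  # hypothetical step name
        "ActionOnFailure": "CANCEL_AND_WAIT",
        "HadoopJarStep": {
            # command-runner.jar lets an EMR step run a command such as spark-submit
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                "--input", input_uri,
                "--output", output_uri,
            ],
        },
    }


# Placeholder bucket and paths, for illustration only.
step = build_spark_step(
    "s3://example-bucket/scripts/transform.py",
    "s3://example-bucket/raw/",
    "s3://example-bucket/processed/",
)
print(step["HadoopJarStep"]["Args"])
```

A dictionary of this shape can be passed in the `Steps` list of boto3's `add_job_flow_steps`, or in the `steps` argument of Airflow's `EmrAddStepsOperator`, which is one way a managed Airflow DAG could trigger the Spark job.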
Alternatives and similar repositories for Batch-ETL-with-AWS-EMR-and-MWAA
Users that are interested in Batch-ETL-with-AWS-EMR-and-MWAA are comparing it to the libraries listed below
- This project aims to rate football players using data and statistics recorded from the last match they participated in. Much of the code … ☆12 Nov 22, 2021 Updated 4 years ago
- ☆22 Jul 29, 2024 Updated last year
- ☆34 Feb 19, 2026 Updated 2 weeks ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆29 Aug 8, 2020 Updated 5 years ago
- Spark data pipeline that processes movie ratings data. ☆31 Mar 1, 2026 Updated last week
- Solution to Data at ANZ virtual internship on Forage ☆10 May 30, 2021 Updated 4 years ago
- ☆10 Sep 22, 2020 Updated 5 years ago
- Project evaluating player skill under pressure using StatsBomb public event level data ☆11 Jun 22, 2019 Updated 6 years ago
- Python client for Radarly API ☆10 Aug 3, 2023 Updated 2 years ago
- JSON Schema Validation for the Soccer Common Data Format ☆14 Dec 16, 2025 Updated 2 months ago
- Plotting StatsBomb Freeze Frame shot data ☆10 Jun 24, 2019 Updated 6 years ago
- Implementation of the construction of team models representing the offensive style of play of soccer teams, and analysis based on these m… ☆11 Oct 17, 2022 Updated 3 years ago
- A few examples of making football analytical visualisations using Python and Matplotlib ☆10 Aug 10, 2021 Updated 4 years ago
- Match previews for 1770 games in the English Premier League ☆10 Aug 27, 2020 Updated 5 years ago
- Finding out the most important stats for a midfielder ☆12 Sep 9, 2023 Updated 2 years ago
- Fbref is a popular football stats/metrics site which collects information from Ted Knutson's StatsBomb. You can save this data in a cs… ☆11 Sep 20, 2021 Updated 4 years ago
- optasoccer is a Python library for reading Opta soccer data ☆11 Mar 14, 2024 Updated last year
- Albumentations Data Augmentation Plugin for FiftyOne! ☆14 Aug 22, 2024 Updated last year
- Tutorials and talks held from PyData London 2019 ☆12 Nov 22, 2022 Updated 3 years ago
- Code to scrape CVPR website for list of accepted papers, find their arXiv links, extract metadata, and download pdfs ☆10 Jun 12, 2024 Updated last year
- In this project I used Apache Airflow to scrape a website periodically. This is for the tutorials I do on YouTube. ☆10 Nov 21, 2022 Updated 3 years ago
- This is an R translated version of Devin Pleuler's Applied Soccer Analytics with Python ☆10 Mar 14, 2019 Updated 6 years ago
- Implementing Meta AI's VGGT as a FiftyOne Remote Zoo Model ☆20 Jun 20, 2025 Updated 8 months ago
- CLV prediction with pareto-NBD model ☆12 Jul 1, 2016 Updated 9 years ago
- ☆10 Sep 10, 2022 Updated 3 years ago
- Python environment setup and customizations. ☆12 Apr 27, 2021 Updated 4 years ago
- Data and regressions on Premier League teams from 2000-01 through to 2016-17 ☆11 Jul 31, 2017 Updated 8 years ago
- Notes from my data science escapades in Python. ☆12 Jun 17, 2024 Updated last year
- ☆13 May 23, 2024 Updated last year
- A growing collection of college basketball visualizations and their source code ☆14 Dec 30, 2023 Updated 2 years ago
- Contains code and content output for assignments for the Uppsala University Mathematical Modelling of Football course. ☆14 Nov 15, 2020 Updated 5 years ago
- Fast and computationally efficient Continual Learning for NanoDet anchor-free Object Detector ☆12 Dec 16, 2024 Updated last year
- ⚽🦁 ☆12 Jan 13, 2026 Updated last month
- Interactive R tutorials for SPMC350 at the University of Nebraska-Lincoln's College of Journalism and Mass Communications ☆14 Jan 5, 2026 Updated 2 months ago
- A Python package that is a wrapper for Plotly to generate football tracking and event data plots ☆13 Aug 11, 2021 Updated 4 years ago
- Files used for Senior Thesis (Yale University, Spring 2019) ☆11 Oct 3, 2019 Updated 6 years ago
- Code for my bachelor thesis ☆14 Dec 6, 2023 Updated 2 years ago
- Introduction to Git and GitHub using RStudio and R ☆13 Feb 3, 2025 Updated last year
- ☆12 Sep 9, 2021 Updated 4 years ago