AdeboyeML / UK_Accident_Traffic_ETL_Pipeline
This is a capstone project that builds an end-to-end ETL (Extract-Transform-Load) data pipeline. The pipeline extracts UK accident and traffic datasets from Amazon S3, cleans and transforms them with PySpark, writes the results back to S3, and finally loads them into Amazon Redshift (a distributed database), where the data can be queried for ad-hoc analyses.
☆18 Jun 6, 2020 Updated 5 years ago
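The extract, transform, and load stages described above can be sketched in a few lines. This is a minimal illustration only, not the repository's code: it uses plain Python in place of PySpark, an in-memory CSV string in place of an S3 object, and hypothetical column names, table name, bucket path, and IAM role. The load step only builds a Redshift `COPY` statement; it does not connect to a cluster.

```python
import csv
import io

# Hypothetical raw extract standing in for a CSV object read from S3.
# The column names are assumptions, not the project's actual schema.
RAW_CSV = """accident_index,accident_severity,number_of_casualties
A001,2,1
A002,,3
A003,3,
A001,2,1
A004,1,2
"""


def extract(text):
    """Parse CSV text into a list of row dicts (stand-in for reading from S3)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop incomplete and duplicate rows, then cast numeric columns to int."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip rows with any missing or empty field.
        if any(value is None or value.strip() == "" for value in row.values()):
            continue
        key = row["accident_index"]
        # Skip rows whose accident_index was already kept.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({
            "accident_index": key,
            "accident_severity": int(row["accident_severity"]),
            "number_of_casualties": int(row["number_of_casualties"]),
        })
    return cleaned


def build_copy_statement(table, s3_path, iam_role):
    """Build a Redshift COPY statement for CSV files staged in S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "CSV IGNOREHEADER 1;"
    )


rows = transform(extract(RAW_CSV))

# Table name, bucket path, and role ARN below are placeholders.
copy_sql = build_copy_statement(
    table="accidents",
    s3_path="s3://example-bucket/clean/accidents/",
    iam_role="arn:aws:iam::123456789012:role/example-redshift-role",
)
```

In a real pipeline the transform step would run as a Spark job over the full dataset, and the generated statement would be executed against Redshift through a database client after the cleaned files are written back to S3.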
Alternatives and similar repositories for UK_Accident_Traffic_ETL_Pipeline
Users that are interested in UK_Accident_Traffic_ETL_Pipeline are comparing it to the libraries listed below
- Implementing an end-to-end tweets ETL/analysis pipeline. ☆59 Dec 8, 2022 Updated 3 years ago
- First Craftech Academy course - March 2021 ☆11 Aug 3, 2021 Updated 4 years ago
- ☆12 Sep 19, 2021 Updated 4 years ago
- ☆10 Jan 3, 2022 Updated 4 years ago
- ZINDI GIZ NLP Agricultural Keyword Spotter 3rd place solution, Audio Classification ☆11 Sep 8, 2021 Updated 4 years ago
- Data mining algorithms with Python ☆10 Jun 26, 2019 Updated 6 years ago
- The gaming industry is certainly one of the thriving industries of the modern age and one of those that are most influenced by the advanc… ☆12 Jun 29, 2020 Updated 5 years ago
- Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark ☆11 May 22, 2018 Updated 7 years ago
- All coding projects for CS6515 GA ☆14 Jul 22, 2022 Updated 3 years ago
- ☆15 Dec 2, 2020 Updated 5 years ago
- This repository contains everything you need to become proficient in System Design and Case Studies with Code Implementation ☆18 Jan 27, 2024 Updated 2 years ago
- ETL using Python in Jupyter Notebook, loading CSV, cleaning data, and saving to SQL Database. ☆14 Nov 17, 2020 Updated 5 years ago
- Docker image that includes "official" OpenJdk, Python, Maven and stitches in jpy ☆12 Sep 24, 2020 Updated 5 years ago
- Package for Computational Biology Reading Group ☆13 Apr 20, 2022 Updated 3 years ago
- Build Your Own Roadmap ☆11 Jul 8, 2020 Updated 5 years ago
- ☆13 Jun 23, 2022 Updated 3 years ago
- Teaching notes from my Advanced SQL workshops as local lead instructor at General Assembly New York. The first edition was created for th… ☆18 Feb 14, 2020 Updated 6 years ago
- Guide to CS Engineering and Interview Prep ☆18 Dec 26, 2024 Updated last year
- Roadmap to becoming a web developer in 2017, in Spanish ☆15 Jun 16, 2017 Updated 8 years ago
- ☆14 Aug 9, 2016 Updated 9 years ago
- Steven's 100DaysOfCloudRepo ☆17 Nov 22, 2020 Updated 5 years ago
- This repository contains the 2nd place solution for the GIZ NLP word spotter competition organized by Zindi. ☆14 Dec 3, 2020 Updated 5 years ago
- data visualizations and R code for #TidyTuesday 2021 ☆16 Feb 4, 2022 Updated 4 years ago
- A repo to track data engineering projects ☆13 Nov 11, 2022 Updated 3 years ago
- ☆16 Mar 5, 2025 Updated 11 months ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 Apr 29, 2021 Updated 4 years ago
- Collection of notebooks ☆17 Oct 27, 2024 Updated last year
- This Challenge aims to infer important COVID-19 public health risk factors from outdated data in South Africa ☆20 Dec 8, 2022 Updated 3 years ago
- This repository explains how to predict customer churn. A hackathon organized by Data Science Nigeria (DSN-AI) to help Expresso predict c… ☆21 Oct 17, 2021 Updated 4 years ago
- Time Series Anomaly Detection using a Kolmogorov-Arnold Network ☆26 May 21, 2025 Updated 8 months ago
- Apache Spark 3 for Data Engineering and Analytics with Python, by Packt Publishing ☆24 Jul 23, 2023 Updated 2 years ago
- My Git Repo for Csv Data ☆21 Oct 5, 2025 Updated 4 months ago
- An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform. ☆1,485 Mar 9, 2020 Updated 5 years ago
- Tweepy Stream Example ☆19 Apr 23, 2019 Updated 6 years ago
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆24 Aug 11, 2023 Updated 2 years ago
- This is where we put useful code for our daily job with data. ☆27 Mar 19, 2025 Updated 10 months ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d… ☆24 Nov 22, 2021 Updated 4 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆29 Aug 8, 2020 Updated 5 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 Aug 14, 2023 Updated 2 years ago