angelddaz / de-challenges
Project based learning for Data Engineering fundamentals.
☆13 Updated 4 years ago
Alternatives and similar repositories for de-challenges
Users that are interested in de-challenges are comparing it to the libraries listed below
- pyspark dataframe made easy ☆16 Updated 3 years ago
- Apache Spark Guide ☆34 Updated 3 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆18 Updated 7 years ago
- A repo to track data engineering projects ☆13 Updated 2 years ago
- This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I r… ☆20 Updated 4 years ago
- ☆11 Updated 3 years ago
- Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database ☆14 Updated 4 years ago
- ☆18 Updated 7 years ago
- Explore 120 million taxi trips in real time with Dash and Vaex ☆117 Updated 5 years ago
- Data engineering interviews Q&A for data community by data community ☆64 Updated 5 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆86 Updated 2 years ago
- Code and notebooks containing my experiments in data science, EDA, visualization, and machine learning ☆27 Updated 2 years ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆104 Updated 4 years ago
- Challenge for those applying to the Software Engineer, Big Data position ☆35 Updated 14 years ago
- Jupyter Notebook and Python business intelligence tools and techniques. [Raw upload] ☆86 Updated 2 years ago
- [Video] AWS Certified Machine Learning-Specialty (ML-S) Guide ☆122 Updated 9 months ago
- ∞ Priceloop Engineering Conventions for Scala, Python, Git Workflow etc ☆100 Updated 3 years ago
- A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for … ☆139 Updated 5 years ago
- ☆54 Updated 2 years ago
- This repo is meant to make it really easy to analyze the interplays between health and social media use. ☆46 Updated 3 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆90 Updated 3 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆88 Updated 4 years ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … ☆20 Updated 3 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated last year
- Best practices for engineering ML pipelines. ☆36 Updated 3 years ago
- The goal of this project is to build an RL-based algorithm that can help cab drivers maximize their profits by improving their decision-m… ☆14 Updated 4 years ago
- ☆18 Updated 4 years ago
- Snowflake Cookbook, published by Packt ☆81 Updated 2 years ago
- Cuttle automates the transformation of your Python notebook into deployment-ready projects (API, ML pipeline, or just a Python script) ☆49 Updated 3 years ago
- Useful data science and Python code snippets at Data Science Simplified ☆72 Updated 4 years ago