This repo contains commands that data engineers use in day-to-day work.
☆62Feb 4, 2023Updated 3 years ago
Alternatives and similar repositories for TowardsDataEngineering
Users that are interested in TowardsDataEngineering are comparing it to the libraries listed below.
- PySpark Cheatsheet☆36Jan 18, 2023Updated 3 years ago
- PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster☆491Oct 15, 2024Updated last year
- This repo is mostly created for pyspark and hive related interview questions.☆63Jan 6, 2026Updated 3 months ago
- All Data Engineering notebooks from Datacamp course☆116Dec 11, 2019Updated 6 years ago
- Data Engineering, Data Warehouse, Data Mart, Cloud Data, AWS, SAS, Redshift, S3☆32Feb 2, 2021Updated 5 years ago
- Serious SQL is a Data With Danny virtual data apprenticeship program.☆22Sep 3, 2021Updated 4 years ago
- Class notes from DIO's Aceleração Dev #4 on Data Engineering, taught by Everis.☆13Feb 6, 2021Updated 5 years ago
- This is an all-in-one repository for Data Engineers, ideal for beginners & interview preparation, which includes Python as the main Progr…☆30Mar 21, 2023Updated 3 years ago
- Data engineering interviews Q&A for data community by data community☆66Jun 7, 2020Updated 5 years ago
- ☆27Feb 2, 2018Updated 8 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR☆90Jul 17, 2019Updated 6 years ago
- Case studies from Danny Ma's Serious SQL course☆19Aug 4, 2022Updated 3 years ago
- Fundamentals of Spark with Python (using PySpark), code examples☆363Oct 29, 2022Updated 3 years ago
- Data Engineering Bootcamp 2021☆13Aug 8, 2023Updated 2 years ago
- Free resources for learning data science☆22May 6, 2018Updated 7 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark☆16Jan 22, 2024Updated 2 years ago
- Data Science Learning Notes☆11Oct 18, 2023Updated 2 years ago
- ☆15Feb 4, 2023Updated 3 years ago
- Personal project where I perform some analytics (including Sentiment Analysis) over a Twitter Stream using Big Data Technologies of the H…☆20Apr 14, 2023Updated 2 years ago
- Data set and queries that I use in my Hive and Impala presentations. Slides are usually posted at slideshare.net/markgrover☆20May 19, 2014Updated 11 years ago
- A tool to validate data, built around Apache Spark.☆101Feb 19, 2026Updated last month
- ☆10Nov 28, 2022Updated 3 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution☆19Aug 16, 2020Updated 5 years ago
- Example end to end data engineering project.☆1,401Dec 8, 2022Updated 3 years ago
- Personal Data Engineering Projects☆1,004Feb 8, 2023Updated 3 years ago
- ☆11Jul 13, 2020Updated 5 years ago
- This repo contains all the code used in the Python for Data Engineering Course☆357Apr 24, 2024Updated last year
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated 2 years ago
- A Value Investment Strategy That Combines Security Selection And Market Timing Signals☆10Sep 8, 2019Updated 6 years ago
- Udacity Data Engineering Nanodegree Capstone Project☆37May 9, 2020Updated 5 years ago
- ☆13Oct 15, 2021Updated 4 years ago
- Easy application configuration with python☆11Feb 11, 2026Updated 2 months ago
- Real-time streaming data pipeline for Twitter Tweets☆10Jan 31, 2022Updated 4 years ago
- Collection of Databricks and Jupyter Notebooks☆22Feb 9, 2026Updated 2 months ago
- This is a simple Linear Regression implementation machine learning model and deployment of the same using flask. Data-set of Vadodara Hou…☆10Jan 8, 2020Updated 6 years ago
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc…☆23May 14, 2022Updated 3 years ago
- Collection of all the mini projects made by me so far.☆10Jan 4, 2022Updated 4 years ago
- Hi, my name is Brunna, and I would like to teach you how algorithms learn, in an easy way and all in Portuguese.☆11Mar 27, 2020Updated 6 years ago
- This repo consists of all important concepts for data engineers.☆11Dec 24, 2024Updated last year