cherkavi / cheat-sheet
collection of cheat sheets
☆368Updated this week
Alternatives and similar repositories for cheat-sheet
Users who are interested in cheat-sheet are comparing it to the repositories listed below
- Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collab…☆37Updated 5 years ago
- Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.☆57Updated 2 years ago
- Building Big Data Pipelines with Apache Beam, published by Packt☆86Updated 2 years ago
- Educational notes, hands-on problems w/ solutions for Hadoop ecosystem☆87Updated 6 years ago
- ☆65Updated 2 weeks ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆46Updated last year
- Public Docker Images for popular services☆31Updated 2 months ago
- Creating Data Pipelines with Apache Airflow to manage ETL from Amazon S3 into Amazon Redshift☆14Updated 5 years ago
- Spark data pipeline that processes movie ratings data.☆28Updated last month
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow☆33Updated 4 years ago
- Docker with Airflow and Spark standalone cluster☆257Updated last year
- Simple stream processing pipeline☆102Updated 10 months ago
- Start your Google Cloud Journey with 150 practical demos. Google Cloud Associate Cloud Engineer certification - GCP ACE☆60Updated 11 months ago
- Create Data Lake on AWS S3 to store dimensional tables after processing data using Spark on AWS EMR cluster☆9Updated 5 years ago
- 80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Sp…☆795Updated 3 weeks ago
- Materials of the Official Helm Chart Webinar☆27Updated 3 years ago
- O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian☆215Updated last year
- RedditR for Content Engagement and Recommendation☆22Updated 7 years ago
- data engineering 100 days 🤖 🧲 🦾 | #DE☆40Updated last year
- Data engineering interviews Q&A for data community by data community☆63Updated 4 years ago
- Data Engineering with Spark and Delta Lake☆98Updated 2 years ago
- This repository contains my solutions to the top 50 LeetCode SQL challenges implemented using PySpark DataFrame and PySpark SQL.☆17Updated last year
- This project shows how to capture changes from postgres database and stream them into kafka☆36Updated last year
- Code base for airflow training series Getting easy with Apache Airflow☆39Updated last year
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t…☆30Updated last year
- This repository will help you to learn about databricks concept with the help of examples. It will include all the important topics which…☆98Updated 9 months ago
- ☆143Updated last year
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python☆44Updated 2 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to Apache Spark cluster in different programming languages using Python…☆42Updated last year
- Repo for Introduction to Iceberg Video☆18Updated 11 months ago