NeerajBhadani / bigdata-ml
☆23 Updated 2 years ago
Alternatives and similar repositories for bigdata-ml:
Users who are interested in bigdata-ml are comparing it to the repositories listed below
- Repository used for Spark Trainings ☆53 Updated 2 years ago
- ☆84 Updated 2 years ago
- Airflow training for the crunch conf ☆105 Updated 6 years ago
- ☆29 Updated 4 years ago
- Guide for Databricks Spark certification ☆58 Updated 3 years ago
- ☆25 Updated last year
- My study guide used to pass the CRT020 Spark Certification exam ☆33 Updated 5 years ago
- Jupyter notebooks for PySpark tutorials given at University ☆107 Updated 4 months ago
- A repository for a PySpark Cookbook by Tomasz Drabas and Denny Lee ☆59 Updated 6 years ago
- ☆87 Updated 2 years ago
- Essential PySpark for Scalable Data Analytics, published by Packt ☆44 Updated 2 years ago
- ☆16 Updated 2 years ago
- A series of Jupyter notebooks that walk you through Machine Learning with the Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo… ☆81 Updated last year
- PySpark Cookbook, published by Packt ☆91 Updated 2 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation,… ☆90 Updated 3 years ago
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them ☆50 Updated last week
- Example repo to create end-to-end tests for data pipelines. ☆23 Updated 10 months ago
- The source code for the book Modern Data Engineering with Apache Spark ☆36 Updated 2 years ago
- Data Engineering with Spark and Delta Lake ☆98 Updated 2 years ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- PySpark data-pipeline testing and CI/CD ☆28 Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆54 Updated 2 years ago
- ETL pipeline using PySpark (Spark - Python) ☆114 Updated 5 years ago
- Example repo to kickstart integration with MLflow pipelines. ☆76 Updated 2 years ago
- ☆18 Updated 6 years ago
- Spark app to merge different schemas ☆23 Updated 4 years ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆77 Updated 2 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated 8 months ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Snowflake Cookbook, published by Packt ☆79 Updated 2 years ago