anjalysam / HadoopLinks
This contains instructions for installing Hadoop on Google Colab and running MapReduce in Hadoop
☆33Updated 4 years ago
Alternatives and similar repositories for Hadoop
Users that are interested in Hadoop are comparing it to the libraries listed below
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree☆74Updated last year
- ☆25Updated 3 years ago
- Complete PySpark guide for beginners... I prepared this notebook for my students.☆18Updated 5 years ago
- An end-to-end project on customer segmentation☆81Updated 2 years ago
- Maternal Health Risk prediction MLOps pipeline☆43Updated 2 years ago
- Here I will be exploring various tools and methods that are used in the data engineering process with Python.☆22Updated 4 years ago
- This program provides the skills you need to advance your career in data engineering and recommends training to support your preparation …☆20Updated 2 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t…☆31Updated last year
- This is the code repository for my upcoming session. Will update details after the session☆40Updated 2 years ago
- This repo contains all the material developed during the 9-week bootcamp provided by DPhi in collaboration with DataTalks Club☆21Updated 2 years ago
- Deep Learning Projects on TensorFlow and Keras☆20Updated last year
- An End-to-End Implementation of AutoML with H2O, MLflow, FastAPI, and Streamlit for Insurance Cross-Sell☆77Updated 3 years ago
- A quick reference guide to the most commonly used patterns and functions in PySpark SQL☆55Updated 3 years ago
- Machine Learning Model Serving Patterns and Best Practices☆35Updated last year
- Machine Learning Ops Project☆29Updated last year
- An end-to-end project on customer segmentation☆20Updated 3 years ago
- Mastering Big Data Analytics with PySpark, Published by Packt☆160Updated 10 months ago
- Useful data science and Python code snippets at Data Science Simplified☆72Updated 3 years ago
- ☆35Updated 2 years ago
- Machine Learning for Streaming Data with Python, published by Packt☆71Updated last year
- The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.☆52Updated 2 years ago
- Course Material - Data Science Program☆14Updated last year
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, dbt, BigQuery, etc.☆13Updated 3 years ago
- Processing TfL data for bike usage with Google Cloud Platform.☆45Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆37Updated last year
- Comet for Data Science, published by Packt☆42Updated last year
- Final Project of the MLOps Zoomcamp hosted by DataTalksClub.☆26Updated 2 years ago
- ☆29Updated 2 years ago
- Slides for "Feature engineering for time series forecasting" talk☆60Updated 2 years ago
- Mastering Machine Learning on AWS, published by Packt☆46Updated 2 years ago