caizkun / mapreduce-examples
A collection of MapReduce problems and solutions
☆35Updated 7 years ago
Alternatives and similar repositories for mapreduce-examples
Users who are interested in mapreduce-examples are comparing it to the libraries listed below
- Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.☆96Updated 4 years ago
- I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti…☆29Updated 2 years ago
- Counting Tweets Per User in Real-Time☆42Updated 7 years ago
- This repository contains Spark, MLlib, PySpark and Dataframes projects☆46Updated 7 years ago
- All my projects on Big Data are provided☆27Updated 8 years ago
- ☆53Updated 2 years ago
- Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S. nationwide temperature from IoT sensors in real-time☆71Updated 8 years ago
- Because it's never too late to start taking notes and make them public...☆59Updated 3 weeks ago
- Big data projects implemented by Maniram Yadav☆51Updated 7 years ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree☆74Updated last year
- Data Engineering Capstone Project: ETL Pipelines and Data Warehouse Development☆21Updated 5 years ago
- ETL pipeline using PySpark (Spark - Python)☆117Updated 5 years ago
- Data Engineering, Data Warehouse, Data Mart, Cloud Data, AWS, SAS, Redshift, S3☆30Updated 4 years ago
- This repository implements a real-time credit card fraud detection pipeline using Kafka, Spark and Cassandra. Kafka continuously produces…☆20Updated 4 years ago
- My cheat sheets.☆44Updated 5 years ago
- Mastering Big Data Analytics with PySpark, Published by Packt☆160Updated 10 months ago
- This repository will help you learn Databricks concepts through examples. It will include all the important topics which…☆99Updated 11 months ago
- ☆150Updated 7 years ago
- ☆87Updated 2 years ago
- Apache Spark Course Material☆93Updated 2 years ago
- Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution☆68Updated 4 years ago
- Udacity Data Engineering Nanodegree Program☆52Updated 4 years ago
- Data pipeline project☆35Updated 4 months ago
- Walkthrough notebooks for Deep Learning, Machine Learning, Reinforcement Learning, Spark, Statistics, Algorithms, Scala, Python☆69Updated last year
- Notebooks produced throughout Udacity's Data Engineering Nanodegree course☆73Updated 4 years ago
- Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,…☆57Updated 2 years ago
- This repository contains code for Spark Streaming☆22Updated 4 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆45Updated last year
- ☆14Updated 5 years ago
- My solutions for the Udacity Data Engineering Nanodegree☆34Updated 5 years ago