LuQQiu / DataPipeline
Real-time stock data pipeline: play with Kafka, Cassandra, Spark, Redis, Node.js, Zookeeper
☆81 Updated 8 years ago
Alternatives and similar repositories for DataPipeline:
Users who are interested in DataPipeline are comparing it to the repositories listed below:
- Real-time report dashboard with Apache Kafka, Apache Spark Streaming and Node.js ☆50 Updated last year
- Real-time Machine Learning with Apache Spark on Twitter Public Stream ☆68 Updated 7 years ago
- Twitter Sentiment Analysis using Spark and Kafka ☆115 Updated 5 years ago
- A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0 ☆25 Updated 3 years ago
- Updated repository ☆157 Updated 3 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc. ☆51 Updated 8 years ago
- PySpark Code for Hands-on Learners ☆116 Updated 5 years ago
- Apache Spark Interview Question and Answers ☆20 Updated 4 years ago
- Educational notes, hands-on problems w/ solutions for the Hadoop ecosystem ☆87 Updated 6 years ago
- A movie search engine based on ElasticSearch using Python ☆18 Updated 8 years ago
- Docker container for Kafka - Spark Streaming - Cassandra ☆97 Updated 5 years ago
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 6 years ago
- ETL pipeline using pyspark (Spark - Python) ☆113 Updated 4 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- Analyze and visualize Twitter Sentiment on a world map using Spark MLlib ☆138 Updated 3 years ago
- Apache Spark 2x Machine Learning Cookbook, published by Packt ☆29 Updated 2 years ago
- Takes a Kafka stream into Spark, applies transformations and sinks into Druid. Everything Dockerised. ☆30 Updated last year
- Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University ☆156 Updated 3 months ago
- Quickly set up a POC environment for Kafka+Spark ☆16 Updated 7 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆86 Updated 5 years ago
- Various data stream/batch process demo with Apache Scala Spark 🚀 ☆11 Updated 5 years ago
- Counting Tweets Per User in Real-Time ☆42 Updated 7 years ago
- Examples To Help You Learn Apache Spark ☆77 Updated 6 years ago
- Repository used for Spark Trainings ☆53 Updated last year
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆24 Updated last year
- This project describes how to write a full ETL data pipeline using Spark. ☆15 Updated 2 years ago
- PySpark Cookbook, published by Packt ☆91 Updated 2 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 Updated last year
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆30 Updated 4 years ago
- Final Project for IoT: Big Data Processing and Analytics class. Analyzing U.S. nationwide temperature from IoT sensors in real-time ☆70 Updated 8 years ago