LuQQiu / DataPipeline
Real-time stock data pipeline: play with Kafka, Cassandra, Spark, Redis, Node.js, Zookeeper
☆81 · Updated 7 years ago
Alternatives and similar repositories for DataPipeline:
Users who are interested in DataPipeline are comparing it to the repositories listed below
- Real-time Machine Learning with Apache Spark on Twitter Public Stream ☆68 · Updated 7 years ago
- Realtime social media data analytics with Apache Spark, Python, Kafka, Pandas, etc ☆51 · Updated 8 years ago
- PySpark Code for Hands-on Learners ☆116 · Updated 5 years ago
- Twitter Sentiment Analysis using Spark and Kafka ☆114 · Updated 5 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE ☆26 · Updated 3 years ago
- Real-time report dashboard with Apache Kafka, Apache Spark Streaming and Node.js ☆50 · Updated last year
- A project with examples of using few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0 ☆25 · Updated 3 years ago
- Updated repository ☆157 · Updated 3 years ago
- Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University ☆155 · Updated last month
- Docker container for Kafka - Spark Streaming - Cassandra ☆97 · Updated 5 years ago
- The demo of using Kafka, Spark, Hive, Cassandra, etc by using Docker. It produces the production ready environment for any kinds of big d… ☆32 · Updated 5 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 · Updated 5 years ago
- Educational notes, Hands on problems w/ solutions for hadoop ecosystem ☆86 · Updated 6 years ago
- Counting Tweets Per User in Real-Time ☆41 · Updated 7 years ago
- A movie search engine based on ElasticSearch using Python ☆18 · Updated 7 years ago
- Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References ☆69 · Updated 6 years ago
- Apache Spark is a fast, in-memory data processing engine with elegant and expressive development API's to allow data workers to efficient… ☆55 · Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 · Updated last year
- Ingest tweets with Kafka. Use Spark to track popular hashtags and trendsetters for each hashtag ☆29 · Updated 8 years ago
- ETL pipeline using pyspark (Spark - Python) ☆112 · Updated 4 years ago
- ☆53 · Updated 2 years ago
- Code examples on Apache Spark using python ☆106 · Updated 2 years ago
- ☆105 · Updated 5 years ago
- Real-world Spark pipelines examples ☆83 · Updated 6 years ago
- Materials for IBM Spark contest. About the real-world application of big data and spark. ☆77 · Updated 6 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆30 · Updated 4 years ago
- Self-contained examples of Apache Spark streaming integrated with Apache Kafka. ☆199 · Updated 6 years ago
- Examples To Help You Learn Apache Spark ☆77 · Updated 6 years ago
- Apache Spark Interview Question and Answers ☆21 · Updated 4 years ago
- Example blueprint application for processing high-speed trading data. ☆84 · Updated last year