jfchen / Spark-SQL-Twitter-Analyzer
Processes large amounts of Twitter data using Spark SQL (and its JSON support). Answers questions like "What are the most popular languages?", "Who is most influential?", "Which time zones are most active during the day?" and more.
☆9 · Updated 9 years ago
Related projects:
- Tutorial for Deploying Anaconda Cluster and PySpark on top of Red Hat Storage GlusterFS ☆8 · Updated 9 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆21 · Updated 8 years ago
- A Spark sbt blueprint to build your own Spark apps off of (for cloud native runtime, see the kube/spark examples) ☆55 · Updated 5 years ago
- Real-time dashboard for Twitter Sentiment analysis using Spark Streaming and Watson Tone Analyzer ☆31 · Updated 5 years ago
- A real-time streaming implementation of Markov chain based fraud detection ☆24 · Updated 9 years ago
- Spark in Kaggle competitions ☆9 · Updated 8 years ago
- Additional useful algorithms that can be used with Spark. ☆24 · Updated 9 years ago
- Fast-Data-Processing-with-Spark-2 ☆22 · Updated last year
- Coding exercises for Apache Spark ☆103 · Updated 9 years ago
- A demo of how to use PageRank with Hadoop and SociaLite to identify anomalies in Healthcare Data ☆47 · Updated 8 years ago
- Assembly of fundamental statistics implemented based on Apache Spark ☆31 · Updated 8 years ago
- The code for the in-memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 · Updated 9 years ago
- Building blocks and patterns for building data prep transformations and feature engineering in Spark. ☆16 · Updated 8 years ago
- Examples for Fast Data Processing with Spark ☆59 · Updated 11 years ago
- Spark Tutorial at the University of Maryland ☆38 · Updated 9 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 · Updated 8 years ago
- Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server. ☆27 · Updated 8 years ago
- An Apache Spark app for moving data between Apache Hive and Apache Phoenix/HBase ☆14 · Updated 8 years ago
- This repository contains code files, specifically IPython notebooks, for the assignments in the course "Scalable Machine Learning" by UC Be… ☆30 · Updated 9 years ago
- Training materials for Strata, AMP Camp, etc. ☆150 · Updated 8 years ago
- Tutorials and samples that show you how to get the most out of IBM Analytics for Apache Spark ☆79 · Updated 6 years ago
- Example project to show how to use Spark to read and write Avro/Parquet files ☆50 · Updated 11 years ago
- DEPRECATED! Use https://github.com/h2oai/sparkling-water repository! H2O and Spark interoperability based on Tachyon. ☆44 · Updated 9 years ago
- Code examples supporting the "Introduction to Apache Spark" video published by O'Reilly Media ☆37 · Updated 2 years ago
- A command line tool for Spark packages ☆19 · Updated last year
- PMML evaluator library for the Apache Hive data warehouse software (legacy codebase) ☆13 · Updated 9 years ago