scalding-io / social-media-analytics
Social Media Data Mining and Analytics - HyperLogLog, BloomFilter and CountMinSketch with Scalding & Algebird
☆27 · Updated 5 years ago
Related projects:
- Additional useful algorithms that can be used with Spark. ☆24 · Updated 9 years ago
- Fraud Detection Online (Hadoop application) ☆17 · Updated 10 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆21 · Updated 8 years ago
- Machine Learning over Twitter's stream, using Apache Spark, Web Server and Lightning Graph server. ☆27 · Updated 8 years ago
- A collection of fundamental statistics implemented on Apache Spark ☆31 · Updated 8 years ago
- Distributed Streaming Quantiles (for PySpark) ☆37 · Updated 10 years ago
- Spark in Kaggle competitions ☆9 · Updated 8 years ago
- Tweet Analysis with Spark ☆15 · Updated 7 years ago
- Predicting The Stock Market using Time Series Analysis and Media ☆11 · Updated 9 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 · Updated 8 years ago
- Templates for projects built on top of H2O. ☆37 · Updated last year
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 · Updated 8 years ago
- Building blocks and patterns for building data prep transformations and feature engineering in Spark. ☆16 · Updated 8 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆91 · Updated 8 years ago
- Examples for Fast Data Processing with Spark ☆59 · Updated 11 years ago
- This repository contains code files, specifically IPython notebooks, for the assignments in the course "Scalable Machine Learning" by UC Be… ☆30 · Updated 9 years ago
- Word2Vec models with Twitter data using Spark. ☆66 · Updated 5 years ago
- Sparse feature extraction with Spark ☆29 · Updated 6 years ago
- A real-time streaming implementation of Markov chain based fraud detection ☆24 · Updated 9 years ago
- Java implementation of Microsoft's AdPredictor algorithm ☆17 · Updated 6 years ago
- An API for Distributed Machine Learning ☆154 · Updated 7 years ago
- Implementation of the Apriori algorithm using Spark. ☆38 · Updated 9 years ago
- Process large amounts of Twitter data using Spark SQL (and its JSON support). Answers questions like "What are the most popular languages?… ☆9 · Updated 9 years ago
- Another, hopefully better, implementation of ALS on Spark ☆14 · Updated 9 years ago
- A PredictionIO engine template using Latent Dirichlet Allocation to learn a topic model from raw text ☆12 · Updated 8 years ago
- Experiments on English Wikipedia. GloVe and word2vec. ☆13 · Updated 8 years ago