pranab / beymani
Hadoop, Spark and Storm based anomaly detection implementations for data quality, cyber security, fraud detection etc.
☆129Updated 9 months ago
Related projects
Alternatives and complementary repositories for beymani
- ☆48Updated 8 years ago
- Anomaly detection model that uses Spark for training and Spark Streaming for testing☆66Updated 8 years ago
- Detecting outliers in a dataset using Spark☆41Updated 8 years ago
- ☆35Updated 8 years ago
- A spark package for loading Spark ML models to Redis-ML☆63Updated 5 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark☆51Updated 10 years ago
- Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References☆70Updated 5 years ago
- SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams.☆427Updated 8 years ago
- A real-time streaming implementation of Markov chain-based fraud detection☆24Updated 9 years ago
- Distributed, streaming anomaly detection and prediction with HTM in Apache Flink☆136Updated 7 years ago
- Coding exercises for Apache Spark☆104Updated 9 years ago
- Building blocks and patterns for building data prep transformations and feature engineering in Spark.☆16Updated 8 years ago
- Code for Packt Publishing's Spark for Data Science Cookbook.☆22Updated 7 years ago
- Structured Streaming Machine Learning example with Spark 2.0☆92Updated 7 years ago
- Csv2Hive is a useful CSV schema finder for big data. It automatically discovers schemas in big CSV files, generates the 'CREATE TABL…☆27Updated 7 years ago
- Additional useful algorithms that can be used with spark.☆24Updated 9 years ago
- Kaggle's click through rate prediction with Spark Pipeline API☆23Updated 8 years ago
- ☆41Updated 8 years ago
- Helpful user-defined functions / table-generating functions for Hive☆101Updated 8 years ago
- Anomaly detection framework @ PayPal☆107Updated 5 years ago
- HDP Data Science/Machine Learning demo☆37Updated 9 years ago
- Scalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Str…☆110Updated last year
- Helpful resources for (big) data science☆33Updated 3 years ago
- This project provides sequential pattern mining for Apache Spark. The algorithms are based on the work of Philippe Fournier-Viger and co…☆29Updated 9 years ago
- Mastering Spark for Data Science, published by Packt☆46Updated last year
- Uses Spark Core (Python), Spark SQL, Spark MLlib, and Spark Streaming☆46Updated 3 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S…☆66Updated 8 years ago