namebrandon / Sparkov_Data_Generation
Synthetic Credit Card Transaction Generator used in the Sparkov program.
☆148Updated 2 years ago
Alternatives and similar repositories for Sparkov_Data_Generation:
Users who are interested in Sparkov_Data_Generation are comparing it to the repositories listed below:
- Financial Simulator of Mobile Money Service☆107Updated 3 years ago
- Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook☆556Updated last year
- Joblib Apache Spark Backend☆245Updated 7 months ago
- AML End to End Example☆53Updated 2 years ago
- Record matching and entity resolution at scale in Spark☆34Updated last year
- Create streaming data, transfer it to Kafka, modify it with PySpark, and load it into Elasticsearch and MinIO☆59Updated last year
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D…☆30Updated 4 years ago
- A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scal…☆239Updated this week
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service☆67Updated 10 months ago
- Feast AWS guide using Redshift / Spectrum / DynamoDB to build a credit scoring model☆63Updated 3 years ago
- The data represents financial transactions -- bank transfers, purchases, credit card transactions, checks, etc. Most of the transactions…☆45Updated last year
- Data Anomaly and Fraud Detection with Python and R☆22Updated 6 years ago
- Template repo for kickstarting recipes for regression use case☆54Updated 3 months ago
- Fraudfinder: A comprehensive lab series on how to build a real-time fraud detection system on Google Cloud☆217Updated last week
- PySpark phonetic and string matching algorithms☆39Updated last year
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo…☆81Updated last year
- A curated list of awesome open source tools and commercial products for monitoring data quality, monitoring model performance, and profil…☆74Updated 10 months ago
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'☆119Updated last year
- Predict customer lifetime value using AutoML Tables, or ML Engine with a TensorFlow neural network and the Lifetimes Python library.☆168Updated 8 months ago
- ☆67Updated last year
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames.☆103Updated 5 years ago
- Demo assets for DAIS 2021 'Learn to use Databricks for the full ML lifecycle' Talk☆13Updated 3 years ago
- Capturing model drift and handling its response - Example webinar☆108Updated 5 years ago
- Delta Lake Documentation☆49Updated 9 months ago
- Using a feature store to connect the DataOps and MLOps workflows to enable collaborative teams to develop efficiently.☆56Updated 2 years ago
- This repository implements a real-time credit card fraud detection pipeline using Kafka, Spark and Cassandra. Kafka continuously produces…☆20Updated 4 years ago
- 🐍 Quick reference guide to common patterns & functions in PySpark.☆511Updated 2 years ago
- Fundamentals of Spark with Python (using PySpark), code examples☆343Updated 2 years ago
- An example of an ETL pipeline that lays out generic DE processes. This is now out of date but still provides useful information☆27Updated 2 years ago
- Streaming Anomaly Detection Solution by using Pub/Sub, Dataflow, BQML & Cloud DLP☆181Updated last month